Open-Source AI Models Close the Gap to Proprietary Systems
The performance gap between open-source and proprietary AI models has collapsed from 17.5 percentage points in 2023 to near zero in Q1 2026. Three model families now compete directly with the best commercial systems, while inference costs have dropped by more than 99%. For European enterprises, this shift creates new options for data sovereignty, cost control and regulatory compliance.
Open-source AI models have reached parity with proprietary systems in Q1 2026. DeepSeek V3.2 (671B parameters, MIT licence) surpasses GPT-5 High on the AIME-2025 math benchmark at 96.0% versus 94.6%, at one-tenth the API cost. Mistral Small 4 from France offers 119B parameters with only 6B active per token under Apache 2.0, giving European enterprises an EU-based provider option. Qwen 3.5 demonstrates that even a 9B model can beat much larger systems on reasoning tasks, scoring 81.7% on GPQA Diamond versus 71.5% for GPT-OSS-120B. However, security risks are real: the OSSRA 2026 report found 581 vulnerabilities per open-source codebase on average, and 175,108 Ollama instances are publicly exposed. The EU AI Act does not exempt open-source models from core obligations, with full GPAI provider requirements taking effect from August 2026. European enterprises should start pilot projects now while establishing security and compliance processes.
The Performance Gap Has Virtually Disappeared
In 2023, the best open-source LLMs trailed proprietary models by an average of 17.5 percentage points across standard benchmarks. By Q1 2026, that gap has effectively closed. Three model families now match or exceed commercial systems like GPT-5 and Claude on most tasks: DeepSeek V3.2 from China (MIT licence), Mistral Small 4 from France (Apache 2.0), and Qwen 3.5 from China (Apache 2.0).
This convergence is not merely about raw benchmark scores. It extends to practical capabilities: code generation, mathematical reasoning, multilingual comprehension and long-context processing. At the same time, inference costs have dropped by more than 99% in two years, making these models economically viable for production workloads that were previously reserved for well-funded organisations with proprietary API contracts.
Sources: California Management Review, Berkeley, January 2026; InfoQ DeepSeek V3.2 analysis, January 2026
DeepSeek V3.2: Reasoning Power Under MIT Licence
DeepSeek V3.2 is a 671 billion parameter mixture-of-experts (MoE) model released under the MIT licence. Its Speciale variant achieves 96.0% on the AIME-2025 mathematics benchmark, surpassing GPT-5 High at 94.6%. On the SWE-Bench coding benchmark, it scores between 72% and 74%, placing it among the top-performing models for real-world software engineering tasks.
The cost advantage is equally significant. DeepSeek V3.2 API pricing sits at approximately $0.028 per million input tokens, roughly one-tenth of GPT-5. For enterprises running high-volume inference workloads, this difference translates directly to budget savings without sacrificing output quality. The MIT licence permits commercial use, modification and redistribution without restrictions.
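The pricing ratio above is easy to sanity-check. The sketch below compares monthly input-token spend at the quoted DeepSeek V3.2 rate against a GPT-5 rate inferred from the article's "one-tenth" ratio; the GPT-5 figure and the example workload volume are assumptions for illustration, not quoted prices.

```python
# Back-of-envelope inference cost comparison. Prices are USD per million
# input tokens; the GPT-5 figure is inferred from the article's tenfold
# ratio and is an assumption, not a published price.
PRICES_PER_M_TOKENS = {
    "deepseek-v3.2": 0.028,
    "gpt-5": 0.28,  # assumed ~10x DeepSeek, per the stated ratio
}

def monthly_cost(model: str, tokens_per_day: int, days: int = 30) -> float:
    """Estimated monthly input-token cost in USD."""
    return PRICES_PER_M_TOKENS[model] * tokens_per_day * days / 1_000_000

# Example workload: 500 million input tokens per day.
for model, _ in PRICES_PER_M_TOKENS.items():
    print(f"{model}: ${monthly_cost(model, 500_000_000):,.2f}/month")
```

At that volume the gap is $420 versus $4,200 per month; at enterprise scale the same ratio compounds into the budget savings the article describes.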
Mixture-of-Experts (MoE)
An architecture where a large model is divided into specialised sub-networks ("experts"). For each input, only a small subset of experts is activated, reducing computational cost while maintaining the capacity of the full model. DeepSeek V3.2 has 671B total parameters but activates far fewer per token.
DeepSeek V3.2 demonstrates that open-source models can lead on both performance and cost. At $0.028 per million input tokens, it is ten times cheaper than GPT-5 while matching or exceeding it on mathematical reasoning benchmarks.
Mistral Small 4: A European Model
Mistral Small 4 is a 119 billion parameter MoE model from Mistral AI, a French company. Only 6 billion parameters are active per token, using 128 experts with 4 active at any given time. It supports a 256,000 token context window and is released under the Apache 2.0 licence. Compared to its predecessor, it delivers 40% less latency and three times the throughput.
For European enterprises, Mistral's origin matters as much as its technical specifications. As an EU-based provider, Mistral simplifies compliance with GDPR and the EU AI Act. Data does not need to leave European jurisdiction, and the company operates under European regulatory oversight. This makes it a practical choice for organisations that need to balance AI capability with data sovereignty requirements.
Mistral Small 4 is not just a technical alternative. It is the first open-source model where European enterprises can use an EU-based provider without compromising on performance.
Based on Mistral AI product announcement, March 2026
Qwen 3.5: Small Models, Big Impact
The Qwen 3.5 family from Alibaba Cloud demonstrates that model size is no longer the primary predictor of performance. The 9B parameter variant scores 81.7% on the GPQA Diamond benchmark, beating GPT-OSS-120B, a model thirteen times larger, which scores 71.5%. The flagship 397B model (with 17B active parameters) supports a 256,000 token context window and more than 200 languages.
Even at the smallest end of the range, efficiency gains are striking. The Qwen 3.5 2B model achieves 66.5% on MMLU, compared to 45.3% for Llama 2 7B, a model more than three times its size. This makes on-device and edge deployment viable for tasks that previously required cloud infrastructure.
| Model | Parameters | Active Params | GPQA Diamond | MMLU | Licence |
|---|---|---|---|---|---|
| Qwen 3.5-9B | 9B | 9B (dense) | 81.7% | - | Apache 2.0 |
| Qwen 3.5 Flagship | 397B | 17B | - | - | Apache 2.0 |
| Qwen 3.5-2B | 2B | 2B (dense) | - | 66.5% | Apache 2.0 |
| Llama 2 7B | 7B | 7B (dense) | - | 45.3% | Llama 2 Community |
| GPT-OSS-120B | 120B | - | 71.5% | - | Open (OpenAI) |
Sources: VentureBeat Qwen 3.5 analysis, March 2026; MarkTechPost Qwen 3.5 Small Models, March 2026
What European Enterprises Gain
Open-source AI models offer European enterprises three advantages that go beyond cost savings: data sovereignty, provider independence and regulatory alignment. With Mistral as an EU-based provider and the option to run any open-weight model on-premises or through European cloud infrastructure, organisations can process sensitive data without it leaving EU jurisdiction.
The EU is actively building infrastructure to support this. Nineteen EU AI Factories are being established through the EuroHPC programme, providing access to GPU compute for training and fine-tuning. Tilde, a Latvian language technology company, has already demonstrated what this infrastructure enables: using 2 million GPU hours on the LUMI supercomputer in Finland, the company trained TildeOpen LLM, an open-source model optimised for Baltic and Nordic languages.
Data Sovereignty
Run models on-premises or on European cloud infrastructure. Sensitive data stays within EU jurisdiction, meeting GDPR requirements without complex cross-border data transfer agreements.
Provider Independence
No lock-in to US or Chinese AI providers. Open-weight models can be self-hosted, fine-tuned and switched between infrastructure providers without API dependency.
Regulatory Alignment
EU AI Act compliance is simpler when using EU-based providers like Mistral. Open-source models also allow direct inspection of model behaviour for transparency obligations.
Yet adoption remains slow. Only 14% of EU companies used AI in any form in 2024, according to Eurostat. The availability of capable, affordable open-source models removes one barrier, but skills gaps, infrastructure readiness and regulatory uncertainty continue to hold back broader deployment.
EU AI Act: What Applies to Open-Source Models
The EU AI Act does not treat open-source AI models differently from proprietary ones for core requirements. All providers of general-purpose AI (GPAI) models must comply with copyright obligations and publish a sufficiently detailed summary of training data. The systemic risk threshold is set at 10^25 FLOPs of compute used for training. Models exceeding this threshold, which includes the largest open-source models, face the full set of GPAI obligations.
Full GPAI provider obligations take effect from August 2026. This means open-source model developers must prepare for compliance within the same timeline as proprietary providers. For deployers, meaning the enterprises that use these models in their products and services, obligations apply regardless of whether the underlying model is open-source or proprietary.
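Whether a model crosses the systemic risk line can be estimated with a standard back-of-envelope rule. The sketch below uses the common approximation that training compute is roughly 6 × parameters × training tokens; both that approximation and the example token count are assumptions, not figures from the AI Act or any model card.

```python
# Rough check against the EU AI Act's 10^25 FLOP systemic-risk
# threshold, using the widely used ~6 * N_params * N_tokens estimate of
# training compute. The example token count is hypothetical.
THRESHOLD_FLOPS = 1e25

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute in FLOPs."""
    return 6 * n_params * n_tokens

# e.g. a 671B-parameter model trained on 15 trillion tokens (assumed):
flops = training_flops(671e9, 15e12)
print(f"{flops:.2e}", flops > THRESHOLD_FLOPS)  # 6.04e+25 True
```

Under these assumptions, a frontier-scale open model lands well above the threshold, which is why the article notes that the largest open-source models face the full set of GPAI obligations.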
Copyright + Data Summary
All GPAI providers must comply with copyright law and publish training data summaries
Full GPAI Obligations
Complete compliance required for all general-purpose AI model providers, including open-source
Systemic Risk: 10^25 FLOPs
Models trained with compute exceeding this threshold face additional obligations for risk assessment and mitigation
Deployer responsibility: Even if a model is released under a permissive open-source licence, the enterprise deploying it in a product or service bears full responsibility for compliance. An open-source licence covers intellectual property, not regulatory obligations. Enterprises must conduct their own risk assessments and maintain documentation.
Challenges and Risks
The availability of free model weights does not mean free or risk-free operations. The Black Duck OSSRA 2026 report reveals the scale of security challenges in the open-source ecosystem: the average codebase now contains 581 known vulnerabilities, a number that has doubled year over year. 87% of audited codebases had at least one vulnerability. These numbers apply to the broader open-source software ecosystem, but AI model dependencies, inference frameworks and tooling face the same patterns.
Infrastructure security is a particular concern. The Register reported 175,108 Ollama instances publicly accessible on the internet, meaning local AI model servers were exposed without authentication. Backdoor attacks on open model weights are difficult to detect because malicious modifications can be hidden within billions of parameters without affecting overall benchmark scores. Trend Micro has documented a growing supply chain attack surface in the open-source AI ecosystem, where trust in community-maintained repositories can be exploited.
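The exposed-instance problem above is cheap to check for on your own hosts. Ollama listens on port 11434 by default; the sketch below only tests whether that port accepts a TCP connection, which flags reachability rather than proving exploitability, and it says nothing about which interface the server is bound to.

```python
import socket

# Quick reachability check for a local Ollama server. Port 11434 is
# Ollama's default; an open port on a non-loopback interface with no
# authentication in front of it is the exposure pattern the report
# describes. This only tests connectivity, nothing more.
def ollama_reachable(host: str, port: int = 11434, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if ollama_reachable("127.0.0.1"):
    print("Ollama port open on loopback; confirm it is not bound to 0.0.0.0")
else:
    print("No Ollama instance listening locally")
```

A reverse proxy with authentication, or binding the server to loopback only, closes the exposure path that left those 175,108 instances reachable.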
Licence compliance adds another layer of complexity. While DeepSeek V3.2 uses MIT and both Mistral Small 4 and Qwen 3.5 use Apache 2.0, the dependencies, training data and fine-tuning datasets may carry different terms. The OSSRA report found that 68% of codebases had licence conflicts. For enterprises, this means legal review must accompany technical evaluation before any production deployment.
A free model is not a free deployment. The total cost of ownership includes infrastructure, security, compliance review, monitoring and the engineering team to maintain it all.
Based on Trensee Open Source AI Business Models analysis, March 2026
What Enterprises Should Do Now
The window between model availability and regulatory enforcement is closing. Open-source models are ready for production use today, but deploying them responsibly requires preparation across technical, legal and organisational dimensions.
Priority Actions for Enterprise Leaders
- Start a pilot with a small model. Qwen 3.5-9B or Mistral Small 4 can run on modest GPU infrastructure and deliver results comparable to proprietary APIs for many tasks
- Develop a hybrid strategy. Use open-source models for on-premises processing of sensitive data and proprietary APIs for non-critical workloads where convenience outweighs control
- Check available GPU infrastructure. The 19 EU AI Factories through EuroHPC offer subsidised compute for European organisations. Private cloud and colocation options are expanding rapidly
- Establish security processes. Treat open-source model weights like any other software dependency: scan for vulnerabilities, verify provenance, restrict network exposure and monitor for anomalies
- Clarify regulatory obligations. Determine whether your use case falls under deployer obligations of the EU AI Act and begin documentation before the August 2026 deadline
- Evaluate EU-based providers like Mistral for use cases where data residency and provider jurisdiction simplify compliance
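The "verify provenance" step in the list above can be made concrete. The sketch below compares a downloaded weight file against a SHA-256 checksum published by the model provider; the file path and digest in any real use would come from the provider's release notes, and treating a mismatch as a hard failure is the design choice being illustrated.

```python
import hashlib
from pathlib import Path

# Provenance check for downloaded model weights: recompute the file's
# SHA-256 and compare it to the digest the provider published. Any
# real path and expected digest are placeholders to be supplied by
# the model's release artefacts.
def sha256_of(path: Path, chunk: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large weight files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verify_weights(path: Path, expected_digest: str) -> bool:
    """Return True only if the local file matches the published digest."""
    return sha256_of(path) == expected_digest.lower()
```

Treating weights like any other software dependency means this check runs in CI before deployment, alongside vulnerability scanning of the inference stack.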
Open-source AI models have reached production quality. The question for European enterprises is no longer whether these models are good enough, but whether internal processes for security, compliance and operations are ready to support them.
Frequently Asked Questions
Which open-source models lead the field in Q1 2026?
Three model families lead the field in Q1 2026: DeepSeek V3.2 (671B parameters, MIT licence), Mistral Small 4 (119B parameters, Apache 2.0), and Qwen 3.5 (up to 397B parameters, Apache 2.0). DeepSeek's Speciale variant scores 96.0% on AIME-2025, surpassing GPT-5 High at 94.6%. These models match or exceed proprietary systems on most standard benchmarks.
How much cheaper are open-source models to run?
DeepSeek V3.2 API costs are approximately $0.028 per million input tokens, roughly one-tenth the cost of GPT-5. Overall inference costs across the industry have dropped by more than 99% in two years, driven largely by open-source competition and efficiency improvements in mixture-of-experts architectures.
Does the EU AI Act apply to open-source models?
Yes. The EU AI Act does not exempt open-source models from core requirements. Providers must comply with copyright obligations and publish training data summaries. Models exceeding the systemic risk threshold of 10^25 FLOPs face full GPAI obligations. Deployer obligations apply regardless of whether the underlying model is open-source or proprietary. Full GPAI provider obligations take effect from August 2026.
What are the main security risks?
The Black Duck OSSRA 2026 report found an average of 581 vulnerabilities per open-source codebase, a figure that has doubled year over year, and 87% of codebases had at least one known vulnerability. Additionally, 175,108 Ollama instances were found publicly exposed on the internet. Backdoor attacks on open model weights are difficult to detect, and 68% of codebases had licence conflicts that could create legal liability.
What makes Mistral Small 4 relevant for European enterprises?
Mistral Small 4 is a 119B parameter mixture-of-experts model from Mistral AI, a French company. Only 6B parameters are active per token, using 128 experts with 4 active at any time. It supports a 256,000 token context window and is released under Apache 2.0. For European enterprises, Mistral offers an EU-based provider option that simplifies GDPR and EU AI Act compliance, avoiding dependence on US or Chinese AI providers.
How should an enterprise get started?
Start with a pilot using a small model such as Qwen 3.5-9B or Mistral Small 4 for a well-defined internal use case. Develop a hybrid strategy that combines open-source models for sensitive data processing with proprietary APIs for general tasks. Check available GPU infrastructure, including the 19 EU AI Factories through EuroHPC. Establish security review processes for model weights and dependencies, and clarify regulatory obligations under the EU AI Act before scaling.