7 reasons why DeepSeek surpassed OpenAI with only 5% of the budget

Written by
Caleb Hayes
Updated on: July 17th, 2025
Recommendation

How DeepSeek achieved technological breakthroughs and commercialization on 5% of the budget, upending the AI industry's traditional model.

Core content:
1. MoE architecture: DeepSeek's "energy-saving lamp" mode cuts computing costs by 90%
2. Reasoning transparency: DeepSeek's engineer-friendly design builds developer trust
3. Local deployment: DeepSeek runs on consumer-grade hardware, ending the reliance on sky-high graphics cards



In the field of AI, high R&D and operating costs have always been an industry pain point. DeepSeek, however, broke free of this constraint with astonishing efficiency: using only 5% of OpenAI's budget, it achieved both technological breakthroughs and commercialization. This article reveals the seven core strategies behind that feat and demonstrates the power of disruptive innovation.

1. MoE architecture: precisely activated “energy-saving lamp” mode

OpenAI's models activate all parameters at inference time, driving up computational cost. It is like lighting up an entire skyscraper every time you need something from a single room: the full-activation strategy consumes enormous amounts of energy and money.

In contrast, DeepSeek uses a sparse Mixture-of-Experts activation strategy, activating only a subset of parameters for each task. This lets DeepSeek cut computational cost dramatically while maintaining high performance.

    • Cost comparison:
      OpenAI: full parameter activation → heavy compute spend on every query
      DeepSeek: sparse activation → a fraction of the compute
      This strategy directly cuts computing overhead by 90%, laying the hardware foundation for low cost (a minimal gating sketch follows).
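To make the "energy-saving lamp" metaphor concrete, here is a minimal sketch of top-k expert gating in Python with NumPy. It illustrates sparse MoE activation in general, not DeepSeek's actual implementation; the expert count, k, and dimensions are arbitrary assumptions.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route input x through only the top-k experts (sparse activation)."""
    logits = x @ gate_w                      # gating scores, one per expert
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only k expert networks actually run; the rest stay "dark", saving compute.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 small linear "experts", only 2 active per input.
rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [(lambda W: (lambda x: x @ W))(rng.normal(size=(d, d))) for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (16,) -- same output size, but only 2 of 8 experts were computed
```

The output has the same shape as a dense forward pass; the saving comes purely from leaving 6 of the 8 experts untouched on this input.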

2. Reasoning transparency: engineer-friendly design that says goodbye to the "black box"

OpenAI's models are often seen as "black boxes" whose decision-making process is difficult to explain. DeepSeek exposes transparent steps in its reasoning process, especially on mathematical and programming tasks, showing the deduction step by step; this makes debugging easier and strengthens user trust:

    Example comparison:
    OpenAI: input question → directly output answer (the logic cannot be traced)
    DeepSeek: input question → step-by-step deduction → final answer (transparent and auditable)
    This not only improves developer trust but also triples debugging efficiency, greatly reducing subsequent maintenance costs (a small parsing sketch follows).
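In practice, DeepSeek-R1-style models are commonly reported to wrap their reasoning in `<think>...</think>` tags ahead of the final answer, so a client can log the audit trail separately from the user-facing reply. A minimal parsing sketch (verify the tag convention against your model's actual output):

```python
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate the auditable reasoning trace from the final answer."""
    m = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    reasoning = m.group(1).strip() if m else ""
    answer = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
    return reasoning, answer

raw = "<think>2x + 1 = 7, so 2x = 6, hence x = 3.</think>x = 3"
trace, answer = split_reasoning(raw)
print("trace:", trace)    # the step-by-step deduction, kept for debugging
print("answer:", answer)  # what the end user sees
```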

3. Localized deployment: ending the reliance on “expensive graphics cards”

DeepSeek can run efficiently on consumer-grade hardware without relying on expensive cloud resources. This not only reduces costs but also strengthens data privacy, because data can be processed locally:

    • Hardware requirement comparison:
      OpenAI: 10× H100 → ~$300,000
      DeepSeek: 2× RTX 4090 → ~$3,000
      The cost falls by 99%, and local data processing avoids the privacy risks of cloud services. This innovation even shook NVIDIA's monopoly business model (a local-inference sketch follows).
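As one concrete local setup, a distilled DeepSeek-R1 can be served through Ollama and queried over its local HTTP API, so neither prompts nor answers ever leave the machine. A minimal sketch, assuming Ollama is running on its default port and the `deepseek-r1:7b` model tag has already been pulled:

```python
import requests

# All inference happens on localhost: the prompt and answer never leave the machine.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:7b",   # assumed tag; pull first with `ollama pull deepseek-r1:7b`
        "prompt": "Prove that the sum of two even numbers is even.",
        "stream": False,             # return one JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```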

4. Three-stage training method: cutting out redundant manpower and computing power

DeepSeek's training pipeline is divided into three stages: cold-start fine-tuning, reasoning-oriented reinforcement learning, and rejection sampling with final fine-tuning. Unlike OpenAI, which relies on extensive human feedback and expensive supervised training, DeepSeek cuts training costs sharply through rule-based rewards and automated reasoning reinforcement learning:

    1. Cold-start fine-tuning: Replace massive annotation with a high-quality chain-of-thought dataset, saving 80% of supervised training costs.

    2. Rule-based reinforcement learning: Replace human feedback with hard metrics such as mathematical correctness and code pass rate, saving millions of dollars in labeling costs.

    3. Rejection sampling optimization: Automatically select the best answers for fine-tuning the model, avoiding the accumulated generalization errors of OpenAI's approach (see the sketch after this list).
       The total training cost is only 1/20 of OpenAI's, and it yields more accurate vertical-domain models.
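Stage 3 can be pictured as a simple generate-score-keep loop: sample several candidate answers, score each with an automatic checker, and keep only the best one for the next round of fine-tuning. A schematic sketch in Python (the sampler and scorer are placeholder stand-ins, not DeepSeek's actual components):

```python
import random

def generate_candidates(prompt: str, n: int = 8) -> list[str]:
    """Placeholder sampler: stands in for n stochastic generations from the model."""
    return [f"candidate {i} for: {prompt}" for i in range(n)]

def rule_score(answer: str) -> float:
    """Placeholder automatic checker, e.g. run unit tests or verify the math."""
    return random.random()

def rejection_sample(prompt: str, n: int = 8) -> str:
    """Generate n candidates and keep only the highest-scoring one."""
    return max(generate_candidates(prompt, n), key=rule_score)

# The kept (prompt, best answer) pairs form the final fine-tuning dataset,
# with no human annotator anywhere in the loop.
print(rejection_sample("Sort a list in O(n log n) time."))
```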

5. Rule Reward System: Abandon Expensive “AI Supervisors”

OpenAI needs to train a neural reward model to evaluate results, which is like hiring an "AI supervisor": it adds compute consumption and can even be gamed by the model (reward hacking).
DeepSeek instead adopts rule-based rewards directly (for example, +10 points for passing a code test), at zero additional training cost. Experiments show this method exceeds OpenAI's accuracy by 15% on STEM tasks.
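A rule-based reward needs no learned judge at all; it is an ordinary function over verifiable outcomes. A minimal sketch built around the "+10 for passing the code test" example above (the point values and test harness are illustrative assumptions):

```python
def rule_reward(code: str, tests: list[tuple[object, object]]) -> float:
    """Score generated code against hard, verifiable rules -- no neural judge needed."""
    namespace: dict = {}
    try:
        exec(code, namespace)              # define the candidate's `solution` function
        solution = namespace["solution"]
    except Exception:
        return -10.0                       # code that does not even load is penalized

    def passes(arg, want):
        try:
            return solution(arg) == want
        except Exception:
            return False

    passed = sum(passes(arg, want) for arg, want in tests)
    # A full pass earns the flat +10 rule bonus; partial passes earn partial credit.
    return 10.0 if passed == len(tests) else passed / len(tests)

code = "def solution(x):\n    return x * 2"
tests = [(1, 2), (3, 6), (0, 0)]
print(rule_reward(code, tests))  # 10.0 -- all tests pass, the rule grants the bonus
```

Because the score comes from hard checks rather than a trained evaluator, there is no second model to train and nothing for the policy to sycophantically exploit.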

6. Open Source Ecosystem: Global Developers’ “Free R&D Army”

DeepSeek makes full use of open-source tools and community contributions, avoiding expensive proprietary technology and tooling dependencies. Through open source, DeepSeek not only reduces R&D costs but also speeds up iteration, with community-driven benchmarking and problem solving improving efficiency further:


    • Dataset: Use open corpora such as Common Crawl, saving sky-high data licensing fees.

    • Model iteration: The community contributes code and fixes bugs, standing in for a highly paid engineering team.

    • Hardware adaptation: Developers spontaneously optimize support for different GPUs, reducing compatibility costs.
      By one estimate, the open-source ecosystem saves DeepSeek 70% of its R&D expenses and triples its iteration speed.

7. Precise cost flow: every penny spent where it counts

Compare the capital flows of the two:

    • OpenAI: human labeling → reward model training → giant GPU clusters → general model → heavy spend

    • DeepSeek: rule engine → no intermediate evaluation layer → small GPU cluster → vertical model → lean spend
      By cutting out these redundant links, DeepSeek brings inference cost down to 1/40, achieving a key breakthrough in commercialization.

Through innovative architecture design, a transparent reasoning process, local execution capability, an efficient training pipeline, a rule-based reward mechanism, and an open-source ecosystem, DeepSeek has successfully surpassed OpenAI in cost control. These strategies not only make DeepSeek technologically comparable to OpenAI, but also give it a significant advantage in cost-effectiveness.