Reflections on AI innovation and open-source ecosystem development prompted by DeepSeek

Written by
Clara Bennett
Updated on: June 19, 2025
Recommendation

DeepSeek's open-source large models V3 and R1 lead innovation in the AI field and offer a solution from China for the development of global AI.

Core content:
1. DeepSeek's open-source large models match the performance of leading international institutions, and their low-cost training and inference break the monopoly on computing power
2. Full-stack, full-series open source supports autonomous deployment, benefiting every industry
3. An analysis of DeepSeek's innovation model ("compensating for hardware with software," "spreading through open source," "ecosystem first") and of the problems and risks facing China's AI open-source innovation

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)



At a moment of fierce competition in artificial intelligence (AI), DeepSeek has released foundation-model products such as V3 and R1 whose performance is comparable to that of OpenAI, a leading international institution. This not only demonstrates the strength of scientific and technological innovation in China's AI field, but also offers the world an innovation path from China: first, low-cost training and inference, which break the monopoly on high-end computing power and lower the threshold for R&D and application; second, full-stack, full-series open source, which supports autonomous on-demand deployment and benefits every industry. This technological innovation and open-source practice from China is worth studying. This article summarizes the innovation of DeepSeek's open-source models in three aspects: "compensating for hardware with software," "spreading through open source," and "ecosystem first." It then analyzes the problems and risks that China's AI open-source innovation still faces from three angles: large-model entry points, the open-source software supply chain, and open-source infrastructure. Finally, it offers suggestions for strengthening China's AI innovation and foundational open-source software capabilities from four perspectives: large-model operating system innovation, software supply chain assurance, open-source infrastructure construction, and coordinated software-hardware development.



Around the Spring Festival of the Year of the Snake in 2025, the open-source large models released by Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd. (hereinafter "DeepSeek") attracted wide attention at home and abroad. First, their benchmark performance is comparable to that of OpenAI's world-leading closed-source model GPT-4o. Second, their training cost is far lower than that of comparable models, and the reasoning model R1, with its chain of thought, and its distilled versions can be deployed on a wide range of computing devices. Third, the code, documentation, and model weights are fully open source under the MIT license, an extremely permissive open-source license. This combination of high performance, low cost, and open source quickly made DeepSeek the focus of the AI field at home and abroad, and its subsequent adoption across industries has truly brought large-model applications in China "into the homes of ordinary people."



A large model is, in form, a kind of software. Its model file is produced by training, its parameters and data are iterated, its outputs are probabilistic, and it cannot be precisely debugged with breakpoints, giving it obvious black-box characteristics; yet like traditional software it is replicable and reusable, needs an operating system to provide a runtime environment and storage, and must process user input and return output. The DeepSeek large model, a technological innovation and open-source practice from China, therefore also offers a model that the Chinese software industry can analyze in depth and learn from.



This article summarizes DeepSeek's innovation model as "compensating for hardware with software," "spreading through open source," and "ecosystem first." It then analyzes the problems and risks that China's AI open-source innovation still faces from three angles: ecosystem entry points, the open-source software supply chain, and open-source infrastructure. Finally, it puts forward suggestions for strengthening China's foundational scientific and technological capabilities across four dimensions: large-model operating system layout, software supply chain security, open-source infrastructure construction, and coordinated development of software and hardware, so as to better support the long-term progress of China's innovation teams and help them continuously contend for the global technological high ground in AI and software.



1



Analysis of DeepSeek's innovation model



"Softening the hardware" to open up a large-scale model innovation path


With limited computing resources, DeepSeek has significantly reduced its reliance on hardware investment while maintaining high performance, through software architecture innovation and algorithm optimization, giving developers worldwide a reproducible and affordable technical route of "compensating for hardware with software." This has created an inflection point in the scaling law that has been widely celebrated in the large-model field in recent years, and opened a gap in the "high wall" of the computing-power monopoly built on massive hardware investment. The threshold for large-model research and application has been greatly lowered, and small and medium-sized enterprises, research institutions, and even individuals with limited resources now have a real chance to innovate with, and be empowered by, AI.



Software is often overlooked in this wave of large models. In fact, for a fixed hardware architecture and a clear optimization goal, the overall gains from software improvements are usually greater than those from hardware. In 2018, Hennessy and Patterson, winners of the 2017 Turing Award, presented a performance comparison of multiplying two 4096×4096 matrices with different programming methods in their award lecture for the Association for Computing Machinery (ACM). The same comparison appears in the article "There's plenty of room at the Top: What will drive computer performance after Moore's law?" published in Science by Leiserson et al. of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT); the figures are shown in Table 1. As the table shows, a C implementation is 47 times faster than Python, parallelized divide-and-conquer reaches a speedup of about 6,727 times, and adding SIMD instructions pushes the speedup beyond 60,000 times. In the same spirit, DeepSeek programs directly in NVIDIA PTX, an intermediate representation between the high-level CUDA language and the GPU's actual machine code, which likewise yields substantial acceleration.


Table 1: Speedups for multiplying two 4096×4096 matrices with different programming methods
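The gap behind Table 1 can be felt even in a few lines of code. The sketch below is an illustration, not the benchmark from the paper: it multiplies two matrices once in plain Python and once through NumPy, whose BLAS backend applies the vectorization and multithreading discussed above. The matrix size and the exact speedup printed are assumptions that will vary by machine.

```python
# Minimal, illustrative sketch (not the Leiserson et al. benchmark itself):
# multiply two N x N matrices once with a naive pure-Python triple loop and once
# with NumPy, which dispatches to an optimized BLAS that uses SIMD instructions
# and multiple threads on the same hardware.
import time
import numpy as np

N = 128  # much smaller than 4096 so the pure-Python version finishes in seconds
A = np.random.rand(N, N)
B = np.random.rand(N, N)
A_list, B_list = A.tolist(), B.tolist()  # plain Python lists for the naive path

def naive_matmul(X, Y):
    """Textbook triple-loop multiply: the 'plain Python' end of the spectrum."""
    n = len(X)
    Z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            x_ik = X[i][k]
            for j in range(n):
                Z[i][j] += x_ik * Y[k][j]
    return Z

t0 = time.perf_counter()
naive_matmul(A_list, B_list)
t1 = time.perf_counter()
A @ B  # NumPy / BLAS path: vectorized and multithreaded
t2 = time.perf_counter()

print(f"pure Python: {t1 - t0:.2f}s   NumPy/BLAS: {t2 - t1:.4f}s   "
      f"speedup ~{(t1 - t0) / (t2 - t1):.0f}x")
```

The point is not the particular ratio but that nothing about the hardware changes between the two runs; all of the gain comes from the software path taken to the same arithmetic.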


In the past few years, Huawei's HarmonyOS (Hongmeng) operating system has taken the same "compensating for hardware with software" approach: constrained by processor process nodes, it has maintained a good user experience on mobile phones through software optimizations across the operating system, compiler, and rendering engine.



More importantly, a software-based optimization route lays the foundation for rapid dissemination. One major advantage of software over hardware is how easily and quickly it spreads: it can reach end users through online downloads. Imagine if DeepSeek had released a hardware-stacking solution like "Stargate," or relied on some hardware acceleration scheme (such as the TPU that Google designed for deep neural networks); it would have been difficult to spread and promote so quickly.



Achieving rapid user growth through open source


The core competitiveness of software is its users. A large, high-quality, and diverse user base is not only the foundation on which software realizes its value but also a strong driving force for its continuous iteration and innovation. As Bao Yungang, a researcher at the Institute of Computing Technology of the Chinese Academy of Sciences, has noted, under the open-source model the value and spread of software follow Metcalfe's Law: the value of a network is proportional to the square of the number of its users. This shows up in two ways. The first is the user scale effect: the more users, the greater the value, the more feedback and improvements, and the richer the ecosystem. The second is the network effect: more developers mean more application scenarios and faster iteration. When many users become developers and testers, the cost of development and testing drops sharply, driving the software's evolution and value upward and in turn attracting still more developers, forming a sustained virtuous circle.
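As a toy illustration of the quadratic scaling in Metcalfe's Law (and nothing more; the user counts below are arbitrary and do not represent any real platform):

```python
# Toy illustration of Metcalfe's Law, V = k * n^2: quadrupling the user count
# multiplies the network value by roughly sixteen. User counts are arbitrary.
def network_value(n_users: int, k: float = 1.0) -> float:
    return k * n_users ** 2

base = 1_000_000
for n in (base, 4 * base, 16 * base):
    print(f"{n:>12,} users -> value {network_value(n) / network_value(base):>4.0f}x the baseline")
```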



As noted above, a large model is itself a kind of software, so the development model pioneered by open-source software can be fully reused for large models. Yet DeepSeek's open-source models have achieved user growth even faster than traditional software. According to public statistics, the DeepSeek app topped the global download charts of Apple's App Store and Google Play for days on end, with cumulative downloads exceeding 16 million within 18 days of launch, far surpassing the 9 million downloads ChatGPT saw in its first month. This is partly due to the popularity of large models as a concept, but also because DeepSeek opened up its model files, weights, core code, and technical documents almost without reservation. As a result, it attracted more than one million developers worldwide within half a year and built an active developer community that has contributed a large amount of code and tooling and formed a spontaneous atmosphere of technical exchange and learning, as seen in the awesome-deepseek-integration page DeepSeek maintains on GitHub. This community-driven innovation model provides strong momentum for the rapid iteration and application of AI technology. DeepSeek's experience also shows that even in the AI era, open source remains more competitive than a closed monopoly.



Building upstream and downstream ecosystems with standardized interfaces and tools


DeepSeek has also shown great efficiency in building its ecosystem. Within a month, DeepSeek-R1 went from the full 671B version to rapid deployments of 70B, 32B, 7B, and even 1.5B distilled models. From cloud service providers, Internet giants, state-owned enterprises, universities, and research institutes to neighborhood offices, laboratories, and individual users, and from manufacturing to services, education to healthcare, DeepSeek has reached virtually every sector, driving efficiency gains and intelligent transformation.



Behind the rapid growth of the ecosystem are standardized calling interfaces and AI software toolkits, as well as the rapid gathering of upstream and downstream partners. Standardized calling interfaces simplify access for AI applications, allowing DeepSeek to be readily supported by model-serving frameworks such as Ollama, vLLM, and SGLang, and letting entry-point applications such as ChatBox and AnythingLLM connect to DeepSeek quickly. Standardized toolkits have greatly lowered the threshold for deploying AI applications while providing a wealth of pre-trained models and datasets, so developers can meet their own business needs through domain fine-tuning and retrieval-augmented generation (RAG) and build further application innovations on top. At the same time, they allow non-NVIDIA chips such as Huawei Ascend and Cambricon to complete adaptation quickly, creating a pattern of coordinated adaptation between domestic software and hardware.
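To give a concrete sense of what "standardized calling interfaces" means in practice, the sketch below assumes a locally deployed DeepSeek model served through an OpenAI-compatible endpoint, which frameworks such as vLLM and Ollama can expose. The URL, port, and model tag are illustrative assumptions, not fixed values; only the shape of the client code is the point.

```python
# Sketch: because serving frameworks expose an OpenAI-compatible HTTP API, the
# same client code can talk to a locally deployed DeepSeek model by changing
# only the base URL and model name. Values below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # e.g. a local Ollama endpoint
    api_key="not-needed-for-local",        # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="deepseek-r1:7b",  # example tag for a distilled R1 model
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the interface is shared, swapping Ollama for vLLM or SGLang, or swapping one distilled model for another, changes only configuration rather than application code, which is exactly what lets downstream tools integrate so quickly.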



From a more macro ecological perspective, DeepSeek has established a de facto large-model standard in China. Since the release of ChatGPT at the end of 2022, both the United States and China have entered a "war of a hundred models." Although OpenAI has led development and set de facto standards such as prompt engineering, it has chosen a closed-source strategy, and the Windows operating system of its largest investor, Microsoft, is also closed source. Participants in the "application-model-system-hardware" ecosystem chain therefore cannot independently adapt large models and systems, which dampens their willingness to participate and their motivation to innovate. For example, the many manufacturers of non-NVIDIA accelerator cards, unable to modify the base model and its code, can only simulate or translate the NVIDIA GPU instruction set rather than achieve native adaptation with the model; cloud platform providers such as Amazon, Google, and Alibaba, being competitors of Microsoft Azure, cannot achieve full business integration with OpenAI.



After DeepSeek was released as open source, it was integrated not only with applications such as WeChat and WPS but also with services such as Huawei Cloud, Alibaba Cloud, and Tencent Cloud. It has been natively adapted to hardware from Huawei Ascend, Cambricon, Muxi, Haiguang, and Shenwei, and has even spawned a large number of locally deployed all-in-one machines. With DeepSeek as the de facto large-model standard, China is forming an ecosystem that aggregates the entire "application-model-system-hardware" chain. In the long run, this change will reshape the development landscape of AI in China and even the world.



2



Risks and challenges facing China's AI open-source innovation



While recognizing DeepSeek's success, we also need to recognize some of the risks and challenges currently facing China's AI open-source innovation.



Risks of large-model entry-point programs


So-called large-model entry-point programs are, for deployers, model-serving frameworks such as Ollama, SGLang, and vLLM, which start the model-serving process; for users, they are interactive front ends such as ChatBox and AnythingLLM that wrap multiple model services behind a more convenient, flexible, and configurable interface.



Model-serving frameworks such as Ollama usually run as network daemons when serving a large model: they open a port and listen for requests from the network. Once such a daemon has a vulnerability, an attacker can invade the host through that service port. In fact, exploitable vulnerabilities in model services exposed through Ollama have already been reported.
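The exposure concern can be made concrete with a small probe. The sketch below checks whether an Ollama-style daemon answers unauthenticated requests from a given host; the default port (11434) and the model-listing endpoint path are assumptions based on Ollama's public defaults and should be adjusted for a real deployment.

```python
# Sketch: probe whether a model-serving daemon is reachable and answering
# unauthenticated requests. Port and endpoint assume Ollama's documented
# defaults; adapt them to the framework actually in use.
import sys
import requests

host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
url = f"http://{host}:11434/api/tags"  # Ollama's model-listing endpoint

try:
    resp = requests.get(url, timeout=3)
    if resp.ok:
        names = [m.get("name") for m in resp.json().get("models", [])]
        print(f"WARNING: {host}:11434 answers unauthenticated requests; models: {names}")
    else:
        print(f"{host}:11434 reachable but returned HTTP {resp.status_code}")
except requests.RequestException:
    print(f"{host}:11434 not reachable (blocked or bound to localhost only)")
```

In practice, binding the daemon to 127.0.0.1, placing it behind an authenticating reverse proxy, or restricting the port with a firewall closes this kind of exposure.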



As for the user-facing entry programs, although ChatBox and similar tools can demonstrate the security of their own code through open source, they cannot by themselves guarantee the privacy of user data: every conversation is forwarded through, and can be intercepted by, the entry program.



Control over the mainstream entry points will become one of the focal points of large-model competition. So far, however, large-model entry programs still run on the existing mainstream operating systems, so the risk of not controlling the operating system extends to the entry programs: the operating system largely determines who gets to be the entry point. The defeat of Netscape's Navigator browser by Microsoft's Internet Explorer in the 1990s is a cautionary lesson.



Security and reliability risks of the software supply chain


DeepSeek's development relies on a large number of open-source or closed-source components, for example: the PyTorch deep-learning framework and the CUDA GPU acceleration libraries at the base; the Megatron-LM distributed training framework and the FlashAttention efficient attention implementation for training; the FasterTransformer inference acceleration engine, the TensorRT inference optimization library, and the ONNX model exchange standard for inference optimization; Git for version control and Docker for containerized deployment in the toolchain; and the NumPy numerical computing library, the pandas data-processing library, and Hugging Face's dataset management tools for data processing.
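A first step toward managing such a dependency stack is simply to inventory what an environment actually contains. The sketch below lists the Python packages installed in a training or inference environment together with their declared dependencies; it is only a partial view (native libraries such as CUDA and system-level packages do not appear), and the package names checked at the end are examples of typical large-model components, not a claim about DeepSeek's actual environment.

```python
# Sketch: build a rough, SBOM-like inventory of the Python packages in the
# current environment and their declared dependencies, as a starting point
# for supply-chain review. Covers only Python-level components.
from importlib.metadata import distributions

inventory = {}
for dist in distributions():
    name = dist.metadata["Name"]
    requires = dist.requires or []
    inventory[name] = {
        "version": dist.version,
        "depends_on": [r.split(";")[0].strip() for r in requires],
    }

# Print a few entries typical of a large-model stack, if present.
for pkg in ("torch", "numpy", "pandas", "transformers"):
    if pkg in inventory:
        info = inventory[pkg]
        print(f"{pkg}=={info['version']} -> {info['depends_on'][:5]}")
print(f"total packages in environment: {len(inventory)}")
```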



The list above is based only on public information; more tools may be used in practice, and some proprietary tools may not be disclosed. Within this deeply interdependent software supply chain, some key links are still controlled by international vendors (for example, the PyTorch development framework, which originated at Meta, and the Ollama entry program mentioned above) or are one company's proprietary products (such as NVIDIA CUDA), so supply interruption is possible. In addition, according to a recent report by Qi'anxin, supply-chain forgery and poisoning attacks targeting DeepSeek specifically have already appeared. All of this constitutes the security and reliability risk of the software supply chain facing China's AI.



A healthy large-model ecosystem requires an equally healthy open-source software ecosystem. Carefully mapping and continuously maintaining the software supply chain, and especially the key nodes of the open-source supply chain, is an investment that companies, industries, and countries must make to achieve a high level of technological self-reliance in artificial intelligence.



Risks of open-source infrastructure


Not only DeepSeek but nearly all major domestic open-source large-model projects choose to publish on GitHub, a platform owned by the US company Microsoft. GitHub has the densest concentration of developers worldwide, complete open-source infrastructure capabilities, a mature collaboration toolchain, and a growing social network of programmers, so it carries greater international influence and is better for promoting a project. But choosing GitHub also brings future challenges and risks, including but not limited to geopolitical risk, data sovereignty issues, and the possibility of access restrictions. This is not a failing of DeepSeek or of domestic open-source project maintainers; rather, China lacks open-source infrastructure able to compete with GitHub. In terms of completeness of facilities, scale of developer aggregation, degree of internationalization, and operational capability, existing domestic infrastructure still lags GitHub by a wide margin.



In recent years, Hugging Face has risen with the explosion of large models to become the world's most popular model hosting platform. Domestic platforms such as Alibaba's ModelScope have started up and begun to take shape, but compared with Hugging Face there are still significant gaps in functionality, scale, internationalization, and operations.



3



Suggestions for strengthening China's AI innovation capabilities



Based on the above analysis, this article puts forward the following suggestions for strengthening China's AI innovation capabilities.



1

Start research and development of large-model operating systems as soon as possible


Large models still live inside the existing operating system ecosystem in the form of software. Although new entry programs such as ChatBox have emerged, they are not enough to shake the ecological dominance of Windows, iOS, and Android. Apple and Huawei have both proposed intent-oriented development frameworks to integrate large-model capabilities and keep control of the user entry point, while Microsoft has consolidated its desktop monopoly by pre-installing Copilot and bundling it deeply with its office suite, browser, and other products. Chen Haibo's team at Shanghai Jiao Tong University has proposed three technical routes for large-model operating systems: a progressive route (the large model as a plug-in component of the operating system), a radical route (the large model as the operating system), and a fusion route (the large model and the operating system deeply integrated). They recommend the fusion route, which maximizes compatibility with the existing application ecosystem while exploiting large-model capabilities. Given the leap in machine intelligence and the change in interaction paradigms that large models bring, whichever route is taken, research and development of large-model operating systems is urgent. As large models and operating systems evolve, different technical routes will naturally converge; but if the window for building the initial ecosystem is missed, we will face a new and more entrenched ecological monopoly.



2

Strengthen open-source software supply chain governance


Open-source software has become the "raw material" and "components" from which large, complex system software is assembled. A Linux distribution (such as Debian or openEuler) often contains tens of thousands of open-source components, compiled and assembled through their mutual dependencies. A large model likewise depends on open-source components, large and small, from development and training through deployment, operation, and inference. As large models become strategic basic software on a par with operating systems, safeguarding their open-source software supply chain is essential. The Institute of Software of the Chinese Academy of Sciences launched the "Open Source Software Supply Chain Lighting Plan" in 2019 to map the global open-source software knowledge graph, identify the key supply-chain nodes of large and complex basic software such as operating systems, and, through activities such as the "Summer of Open Source," continuously cultivate high-level talent able to maintain key open-source software. It is recommended to keep mapping the open-source supply chain around the component dependencies of large models, focus on the key nodes, invest in or cultivate the corresponding people, and ensure the capacity for sustained open-source maintenance.



3

Accelerate the construction of open source infrastructure comparable to GitHub and Hugging Face


Faced with the dominance of the GitHub and Hugging Face hosting platforms, we should, on the one hand, continue to improve existing domestic code-hosting platforms, enhancing their stability and functional completeness and optimizing the developer experience. On the other hand, we also need a transition strategy: synchronize across multiple platforms and establish a strategic backup mechanism. Since 2019, the Institute of Software of the Chinese Academy of Sciences has been building the "Source Map" open-source software supply-chain infrastructure, which has so far achieved a full backup of key global open-source software and provides platform services such as trusted software repositories and trusted compilation and build environments. Going forward, it is also necessary to accelerate a new generation of open-source development infrastructure for the new needs and scenarios of large models, and to gradually cultivate a local open-source infrastructure ecosystem with domestic strengths. In addition, foreign institutions and developers should be attracted to participate through a more open model, jointly hedging against potential geopolitical risks.



4

Increase open source software and hardware collaboration


Against the backdrop of escalating controls and pressure from the new US administration, restrictions on the supply of NVIDIA GPUs and the ecosystem barrier of CUDA have become major obstacles to China achieving a high level of technological self-reliance in AI. For example, the PTX used in DeepSeek's training optimizations still belongs to the CUDA ecosystem. It is recommended to strengthen software-hardware collaboration around the RISC-V open instruction set, especially its AI-related extensions. The rise of RISC-V is not only about breaking the x86/ARM ecological monopoly at the instruction-set level, but also about breaking the monopoly of NVIDIA's proprietary GPU instruction sets and proprietary operators. As the RISC-V vector and matrix/tensor instruction-set extensions are defined and refined, new software-hardware interface standards are expected to replace the proprietary CUDA interfaces and, together with compilers, achieve software-hardware co-design on dedicated RISC-V AI accelerator cards. Once a RISC-V accelerator card surpasses NVIDIA's flagship GPUs in performance per watt, the entire RISC-V ecosystem will have its own "DeepSeek moment."



It should be emphasized that the risk analysis and suggestions above are not intended to build a closed, defensive technology system, but to give China and the world more open choices, enabling equal participation in the research, development, and application of new AI technologies, products, and services, and jointly building a community with a shared future for humanity in the AI era.