“After Qwen3, I really dared to invest in AI applications”

Written by
Iris Vance
Updated on: June 25, 2025
Recommendation

A new outlet for AI application investment: How Qwen3 leads the industry transformation.

Core content:
1. High expectations of global developers on the eve of Qwen3 release
2. Full spectrum layout of Qwen series models and industrial application advantages
3. How Qwen3 meets diversified deployment needs and promotes the development of AI application ecology

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)
In the early morning of April 29, 2025, five hours before the official release of Qwen3, developers on X and GitHub had already pulled up their chairs to wait. Enthusiasts in the open source community stayed up late, refreshing the page, hoping to be among the first to test this highly anticipated new Chinese model.

Chinese developers likewise stayed up all night, testing the model and publishing evaluation reports as soon as it was released at 5 a.m.

This level of attention was previously reserved for OpenAI's new model releases. Today a Chinese large model can arouse the same eager anticipation from developers around the world, not only because of its technical breakthroughs, but because it can genuinely be used by developers and enterprises to generate industrial value.

"When Chinese enterprises choose models, there are basically only two options: Qwen and DeepSeek. But DeepSeek-R1's parameter count is too large, and many scenarios do not need that much capability. The Qwen series, in contrast, provides a full range of parameter scales from small to large, so a suitable model can be found for any scenario," Zhai Xingji, founder of Yuhe Technology, a company building digital-employee agents, told us.

Especially since the release of Qwen3, the model lineup has broadened further, spanning 0.6B to 235B, and deployment and inference costs have dropped again. The threshold for enterprises and developers has been lowered once more, laying the foundation for an explosion of the application ecosystem.

"After Qwen3, I really dared to invest in large-model applications," an investor confessed to Silicon Star. "On-device computing power genuinely is limited. If you rely blindly on cloud models and cannot deploy locally, many functions are restricted and users also worry about privacy."

The first stage of the large-model race has passed. After the gold rush, few models remain that offer the performance companies and developers need, and Qwen seems to have become the first choice in the Chinese market.

Building models to serve industry

Looking back at the development of the Qwen series, the core difference from other large models becomes clear: rather than simply pursuing technological leadership, Qwen is guided by the actual needs of industry.

"Precise" or "comprehensive": DeepSeek and Qwen represent these two technical directions.

The Qwen series adopts a "full spectrum" layout, providing targeted options for different scenarios. In parameter scale, Qwen3 covers everything from lightweight 0.6B, 1.7B, 4B, 8B, 14B, and 32B dense models to the 30B-A3B and 235B-A22B mixture-of-experts models, spanning deployment requirements from edge devices to the cloud. In model types, the broader Qwen family includes not only base language models but also reasoning models, multimodal visual understanding (VLM), image generation, and video understanding.

Zhai Xingji pointed out: "Its model series covers a wide range, from text to VL multimodal recognition to reasoning models. You will find that it has the whole set. It explored QVQ, a visual reasoning model, very early."

This full-spectrum layout lets companies of every kind find models that fit their scenarios. Li Yong, founder of Yueran Innovation, a children's smart-toy startup, told us: "Previously, limits on chip performance, cost, and power consumption made it impossible to deploy inference models on-device. But the launch of Qwen3-0.6B makes on-device deployment possible. On-device deployment means no Internet connection is needed, which solves privacy issues and eliminates network restrictions and token billing costs."

Across the wider range of application scenarios, Qwen provides precisely matched options for different classes of terminal devices: the 0.6B and 1.7B models support speculative decoding and deployment on small endpoints; the 4B model suits mobile applications; and the 8B model can support smooth on-device deployment on PCs and in vehicles.

Zhai Xingji explained: "When we deploy for customers and need an inference model, I always consider the resources most customers actually have. Many people choose an inference model based on Qwen 32B, or directly use QwQ-32B."

By contrast, Llama was open-sourced earlier than Qwen but gradually became the second choice. First, its selection of parameter scales has obvious gaps: its largest models, at the 400B-500B scale, demand huge computing resources and are difficult for enterprises to deploy, while the 70B model is widely regarded by developers as underpowered.

Qwen 72B, in contrast, is seen as sitting right at the largest parameter scale enterprises can afford, balancing performance and cost. Another developer explained: "By our calculations, an enterprise can deploy at most a 72B model, and nothing larger."

Second, Llama clearly lags in multilingual capability, especially in Chinese. "Llama has relatively little Chinese corpus data: only 5% of its training data is multilingual and 95% is English. So people find it a bit dumb in Chinese scenarios." This weakens Llama's adaptability in global application scenarios, and for Chinese developers in particular, its practical value is greatly reduced.

The success of this strategy shows in the wide recognition Qwen has gained in the open source community: its derivative models worldwide now exceed 100,000, with more than 300 million downloads, accounting for over 30% of global model downloads on Hugging Face in 2024. On the Hugging Face global open source model list of February 2025, the top ten open source models were all secondary developments built on Qwen.

Providing the base model for pre-trained agents

"If an agent does not have multimodality, it will definitely have no future." The above investor said: "At the current stage, building an intelligent agent requires the model to have strong multimodal understanding, reasoning and autonomous decision-making capabilities." When agents have become the core of the next generation of applications, building efficient agents requires strong underlying model support.

After many attempts to build agents for the manufacturing industry, Zhai Xingji believes agents have reached a critical juncture. "The agents being built now, such as Manus, still rely on hand-built workflows behind the scenes. But once the task is no longer a fixed process and must be judged intelligently, such as finding this person to do something, placing an order, finding that person to verify, or canceling, then you need a pre-trained agent model built on a powerful base model."

"First, it should be an inference model; that is the foundation. Then we retrain the inference model, annotating the path data for the first, second, and third steps of the whole task, the thinking data at each step, the thinking data on why each action was taken, and the final result data. Once those are annotated, we iterate continuously with reinforcement learning."
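The labeled trajectory data Zhai describes, a step path, per-step thinking, and a final outcome, might be organized along these lines. All class and field names here are hypothetical illustrations, not an actual training format:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Step:
    index: int      # position in the task path (step 1, 2, 3, ...)
    action: str     # what was done at this step, e.g. "place_order"
    thinking: str   # annotated reasoning: why this step was taken

@dataclass
class Trajectory:
    task: str
    steps: list = field(default_factory=list)
    outcome: str = ""   # final result label, usable as a reward signal for RL

# A hypothetical annotated trajectory for one task.
traj = Trajectory(task="process purchase request")
traj.steps.append(Step(1, "find_approver", "the request exceeds the buyer's own limit"))
traj.steps.append(Step(2, "place_order", "approval received, so the order can go out"))
traj.outcome = "success"

print(asdict(traj)["outcome"])  # → success
```

Structuring the data this way makes the reinforcement-learning loop Zhai mentions concrete: the outcome field scores a whole path, while the per-step thinking fields supervise how the model should reason between actions.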

This training method essentially teaches the model how to decompose tasks, how to think, and how to use tools. Qianwen, as a basic model, provides powerful language understanding capabilities and a logical reasoning framework. Zhai Xingji further added: "Until now, we have been writing reasoning templates ourselves and letting the big model follow this reasoning template step by step. But in the future, we hope that the Agent will come up with a reasoning template in one step, without us providing it. It will think on its own and form a reasoning architecture and path template on its own. This requires a high level of capability from the basic model."

Qwen3's hybrid reasoning capability provides a more flexible thinking and decision-making framework for agent development. In reasoning mode, the model will perform more intermediate steps, while in non-reasoning mode, the model can quickly follow instructions to generate answers. This ability is very similar to human thinking: quick answers to simple questions and careful consideration of complex questions. This hybrid reasoning ability is particularly important when developing intelligent agents.

Qwen has further lowered the development threshold by combining Qwen-Agent with MCP (Model Context Protocol). This move lets developers build intelligent applications quickly and at lower cost. Traditional agent development requires professional AI engineers and heavy resource investment, while the combination of the Qwen-Agent framework and MCP creates a "low-code" development model: developers only need to define the task flow and tool set, and the system handles the complex reasoning and execution path. This greatly simplifies development, allowing ordinary developers without a deep AI background to build powerful intelligent applications.

MCP provides the agent's connective infrastructure, including tool calling, permission management, and data access, while Qwen-Agent focuses on intelligent decision-making and reasoning. In the past, building an agent that could handle customer service might require a 1-2 month development cycle and a professional AI team. Now, with Qwen-Agent and MCP combined, an ordinary developer may need only 1-2 weeks to complete a prototype and deliver a higher-quality interactive experience.
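The "declare tools and a task flow, let the framework run it" pattern can be sketched minimally as follows. All names are hypothetical; this is not the actual Qwen-Agent or MCP API:

```python
# Minimal sketch of low-code agent building: register plain functions as
# tools, declare a task flow, and let a generic runner execute it.
# All names are hypothetical, not the real Qwen-Agent or MCP interfaces.

TOOLS = {}

def tool(name):
    """Register a plain function as a callable tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("lookup_order")
def lookup_order(order_id):
    # Stand-in for a real backend or MCP-exposed data source.
    return {"order_id": order_id, "status": "shipped"}

@tool("draft_reply")
def draft_reply(order):
    return f"Your order is currently: {order['status']}."

def run_flow(flow, context):
    """Run a declared flow: each step names a tool and the context keys
    that feed it; the result is stored back under the step's name."""
    for step_name, tool_name, arg_keys in flow:
        args = [context[k] for k in arg_keys]
        context[step_name] = TOOLS[tool_name](*args)
    return context

ctx = run_flow(
    [("order", "lookup_order", ["order_id"]),
     ("reply", "draft_reply", ["order"])],
    {"order_id": "A123"},
)
print(ctx["reply"])  # → Your order is currently: shipped.
```

The developer's work reduces to the two decorated functions and the two-line flow declaration; in a real framework, the model would choose and sequence the tools instead of a fixed list.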

This approach of lowering the threshold has made AI application development more "popularized," allowing more small and medium-sized enterprises and individual developers to participate in smart application innovation, which will lead to an explosion of applications.

Open source is not a slogan, it is a way of survival

Everything that is open today is built on a foundation of genuine open source.

"Qwen is really generous. They open-source their best-performing models," said Zhai Xingji.

From the smallest 0.6B to the largest 72B, and on to the new-generation 235B MoE model, every size is open source. Updates and iterations are continuous, with new models and capabilities arriving all the time: from text to multimodality, from dialogue to reasoning, openness across the board.

Instead of keeping the best models as closed-source products, Qwen fully open-sources the best model at each tier. The industry norm runs the other way: base-model vendors usually open-source small models with limited performance and keep the high-performance large models as paid API services, closing the commercial loop.

Qwen's models are open to the community at every size, with no capability degradation or functional restrictions. They include not only pre-trained models but also SFT fine-tuned versions, dialogue versions, and instruction-optimized models for various professional fields, giving developers ready-to-use options while still allowing deep modification and secondary development. This stands apart from the "semi-open source" approach of some vendors, which grants only limited access rights. Such unreserved openness is no longer merely an open source strategy; it is the basis for survival.

"By mid-2024, when multimodal models began to mature, Qwen was the first to push its VL model. Qwen 2.0 already had a multimodal model, and 2.5 had a stronger one. Llama 3.2 supported only image recognition, which came too late." Zhai Xingji recalled that the Qwen team was, in his words, "too eager."

The openness of the models and the cloud services form a healthy closed loop. As China's No. 1 cloud vendor, Alibaba Cloud needs more customers using its MaaS offerings. Once the open source ecosystem has built brand awareness, a customer who later needs a closed-source model will naturally choose Qwen.

Another entrepreneur in the B2B field said: "If we are developing applications in China now and can use the cloud, we will definitely give priority to the cloud, which has no operation and maintenance costs and no deployment costs. But if the customer insists on privatization, then we will choose open source model deployment, especially in some special industries, such as finance, government and medical fields, which often require completely private deployment due to data security and compliance requirements."

Globally, Alibaba is the only company that can form a virtuous closed loop between models and cloud. Microsoft chose to cooperate with OpenAI to provide services, and AWS chose to cooperate with Anthropic.

From technology to industry, from research to application, the Qwen series has not only won the favor of developers through a comprehensive open source strategy, but also found its place in the actual business environment. Open source is not just about sharing code, it is a way to build an ecosystem, a bridge connecting developers and enterprises, and the foundation for survival and development in the fierce competition of large models.