Qwen3 Quick Review: Turning Your PC into an AI Workstation

Alibaba's Qwen3 large models bring a new breakthrough for AI on PCs: a personal computer can now comfortably handle real AI workloads.
Core content:
1. Innovations in Qwen3's model architecture and training dataset
2. Analysis of the pre-training strategy and the model lineup
3. Practical uses and potential of deploying AI models on personal computers
On April 29, 2025, Alibaba released the Qwen3 family of large models, which drew wide attention on social media at home and abroad. Here we discuss Qwen3 from the perspectives of technology and application.
According to the official announcement, Qwen3's improvements are concentrated in the following areas:
- Model architecture: Qwen3 uses a mixture-of-experts (MoE) design and supports both "thinking" and "non-thinking" modes within a single model (see the sketch after this list). It is speculated that the technical foundation may draw on DeepSeek, and the optimization idea resembles how Gemini 2.5 Flash unifies the two modes: aligning them through reinforcement learning to reach a better performance balance.
- Dataset: Significantly expanded to 36 trillion tokens, twice the size of Qwen2.5's. Sources include unstructured documents extracted with the help of large models, as well as specially constructed domain-specific data for mathematics, programming, and so on. Thanks to the expanded dataset, Qwen3 now supports 119 languages.
- Pre-training: A staged training strategy is adopted: first language skills and general knowledge, then knowledge-intensive data (such as STEM, programming, and reasoning), and finally high-quality long-text data. This staged approach may help the model better master different types of knowledge and skills.
- Model products: Two families were released, Dense models and MoE models. A Dense model requires more memory (VRAM) but offers lower latency; a MoE model can run in less memory, at the cost of slower inference. And given the recent popularity of intelligent agents, Qwen3 also supports agent and MCP capabilities.
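To illustrate the dual-mode design, the chat template published with Qwen3 exposes an enable_thinking switch. Below is a minimal sketch using the Hugging Face transformers library; the checkpoint name and the flag follow the Qwen3 model card at the time of writing, so verify them against the version you actually use:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name as published on the Qwen3 model card; adjust if needed.
model_name = "Qwen/Qwen3-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Briefly explain mixture-of-experts."}]

# enable_thinking toggles the same model between "thinking" and
# "non-thinking" modes via the chat template (per the Qwen3 model card).
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # set True to let the model emit a <think> block first
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```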
Shrimp Comments:
From a technical perspective, Qwen3 offers no fundamental breakthroughs; rather, it refines the "alchemy" of training. At the application level, however, the modest resource requirements of the open-source MoE models make it possible to run something close to today's mainstream models offline, on a personal computer or edge device. This holds great potential for data-sensitive enterprise scenarios and offline applications on end devices.
For example, the Qwen3-30B-A3B model can be deployed on a machine with 16 GB of RAM or 8 GB of VRAM, which means it can be installed and used on a mainstream personal computer. The Qwen3-235B-A22B model can be deployed on a machine with 256 GB of RAM plus 24 GB of VRAM; an ordinary individual or company only needs to spend tens of thousands of yuan on hardware meeting this configuration.
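To make this concrete, here is a minimal sketch of querying a locally served Qwen3-30B-A3B through an OpenAI-compatible endpoint. It assumes a local server such as Ollama is already running the model; the model tag and port below are assumptions, so check your server's documentation:

```python
# Minimal sketch: query a locally served Qwen3-30B-A3B through an
# OpenAI-compatible API. Assumes a local server (e.g. Ollama) is running;
# the model tag and port are assumptions -- verify against your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="unused",                      # local servers typically ignore the key
)

resp = client.chat.completions.create(
    model="qwen3:30b-a3b",  # assumed tag; pull the model with your server first
    messages=[{"role": "user", "content": "Summarize the MoE idea in two sentences."}],
)
print(resp.choices[0].message.content)
```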
On the other hand, in some professional fields such as mathematics, reasoning, and programming, a customized model may be the better choice. DeepSeek-Prover-V2-671B, released by DeepSeek just before the May Day holiday, is a large model customized for mathematical theorem proving.
For most ordinary users, of course, the best choice is the hosted service for the full-size Qwen3-235B-A22B model. Having one more option for using a large model at very low cost is always welcome.
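As a closing sketch, the hosted model can be called through Alibaba Cloud's OpenAI-compatible endpoint. The endpoint URL and model id below follow the DashScope documentation at the time of writing; treat them as assumptions and verify before use:

```python
# Minimal sketch: call the hosted Qwen3-235B-A22B via DashScope's
# OpenAI-compatible mode. Endpoint URL and model id are assumptions
# based on the docs at the time of writing; verify before use.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # your DashScope API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

resp = client.chat.completions.create(
    model="qwen3-235b-a22b",  # assumed model id on DashScope
    messages=[{"role": "user", "content": "Hello, Qwen3!"}],
)
print(resp.choices[0].message.content)
```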