Qwen3 releases: 4B kills the old 72B / Windsurf adds a new free plan

Qwen3 is the latest shake-up in the AI industry, and this time it leads the trend with small model sizes.
Key points:
1. Qwen3-4B surpasses Qwen2.5-72B, showing the potential of small models
2. Qwen3 feature highlights: a manual switch for the hybrid thinking mode and optimized MCP support
3. Windsurf upgrades its free plan: more premium models and features are now free
Qwen3 wins big with small investment
The Qwen team has apparently realized it has been the "Wang Feng of AI" for too long (like the singer whose every announcement gets buried under bigger news), so it picked the dullest, least-watched moment to release Qwen3. (Of course, that only holds for today, for right now; maybe DeepSeek R2 drops later today?)
The biggest highlight of Qwen3 this time is small beats big: Qwen3-4B flat-out takes down Qwen2.5-72B-Instruct. That is not my exaggeration but the official one, taken straight from their benchmark comparison.
It feels like Qwen3 is ready to change tracks.
First, Qwen (our Wang Feng) gets upstaged by its peers the moment it releases anything new; second, to be honest, I personally can't find a reason to use it, and my usage over the past few months has been close to zero. I used to run it locally through Ollama, but haven't touched it since Google released Gemma.
Beyond that, it is rare for a large-model team to put out models this small. Now that Qwen3 is pushing a small model this hard, it is very likely aiming at the embedded-hardware route: at only 4B, it can run comfortably on all kinds of low-end hardware.
The following is a summary of the Qwen3 model release:
Model List
MoE architecture models
Qwen3-235B-A22B (the largest open model in this release; the true top-end internal model presumably won't be open-sourced)
Total parameters: 235B
Activation parameters: 22B
Context length: 128K
Qwen3-30B-A3B
Total parameters: 30B
Activation parameters: 3B
Context length: 128K
Non-MoE (dense) architecture models, covering 0.6B, 1.7B, 4B, 8B, 14B, and 32B, including the Qwen3-4B highlighted above.
The key new feature: hybrid thinking mode with a manual switch. Qwen3 lets users explicitly turn the thinking chain on or off.
This may look inconspicuous, but I personally think it is the most important feature, because a thinking chain is simply not needed in every scenario.
It's like my earlier complaint about WeChat search bolting on R1: asking about the weather triggers several seconds of "deep thinking", when a single API request would return the correct answer. That is reasoning shoehorned in purely to chase traffic.
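For reference, here is a minimal sketch of the hard switch, loosely following the Hugging Face chat-template pattern for Qwen3-4B; treat the prompt and generation settings as illustrative rather than official guidance:

```python
# Minimal sketch: toggling Qwen3's thinking mode via the chat template
# (prompt and generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What's the weather like in Beijing? Answer briefly."}]

# Hard switch: enable_thinking=False skips the <think>...</think> chain entirely.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # set to True when the task actually needs step-by-step reasoning
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

There is also a soft switch: appending /think or /no_think to a prompt flips the behavior turn by turn in multi-turn conversations.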
In addition, the official announcement notes that coding and agent-task performance has been specifically enhanced, with optimized support for MCP (Model Context Protocol).
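The MCP integration is exposed through the agent tooling rather than the raw model API. Below is a rough sketch following the Qwen-Agent pattern, assuming a locally served OpenAI-compatible endpoint and an example "time" MCP server; the endpoint URL, model name, and server command are placeholders to adapt to your own setup:

```python
# Rough sketch: wiring an MCP tool into Qwen3 via Qwen-Agent
# (endpoint, model name, and MCP server are placeholders).
from qwen_agent.agents import Assistant

# Assumes a local OpenAI-compatible server (e.g. vLLM or Ollama) hosting a Qwen3 model.
llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # placeholder endpoint
    "api_key": "EMPTY",
}

# Tool list mixing an MCP server definition with a built-in tool.
tools = [
    {
        "mcpServers": {
            "time": {  # example MCP server launched via uvx; swap in your own
                "command": "uvx",
                "args": ["mcp-server-time", "--local-timezone=Asia/Shanghai"],
            }
        }
    },
    "code_interpreter",
]

bot = Assistant(llm=llm_cfg, function_list=tools)

messages = [{"role": "user", "content": "What time is it right now?"}]
for responses in bot.run(messages=messages):
    pass  # bot.run streams partial responses; keep the last one
print(responses)
```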
Official Demo is now online: http://chat.qwen.ai
There is not much point rehashing the benchmark rankings and metrics here.
For hands-on experience reports, keep an eye on the follow-up posts on the WeChat public account.
Windsurf new logo and new free plan
Windsurf has also rolled out a new logo, though to be honest I find it less eye-catching than the old one.
Over the past few weeks, Windsurf has also simplified its pricing and opened up free access to frontier models such as GPT-4.1 and o4-mini for all users (reply "Windsurf" to the WeChat public account for details).
Today they go a step further with a comprehensive upgrade to the free plan.
Free users now get 25 premium model credits per month (up from 5). Combined with the discounted rate of 0.25 credits per call for GPT-4.1/o4-mini, that works out to roughly 100 calls a month.
The biggest highlight: unlimited use of the Cascade Base model, giving the full agentic experience in Write mode. This applies to the VS Code and JetBrains plugins as well.
Unlimited fast Tab completion, covering previously paid features such as autocomplete, supercomplete, and Tab-to-jump.
In addition, app deployment access has been added: 1 full deployment per day plus unlimited previews.