Recommendation
Explore the compatibility challenges between Qwen3 and Ollama, and understand the support status of recent versions along with deployment suggestions.
Core content:
1. Why Gemma3 requires a newer version of Ollama
2. Recommended operating environments and how different Linux distributions support new models
3. How the Qwen3 models behave under different Ollama versions, with deployment suggestions
Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)
After a new model is released, Ollama must be updated to support it. For example, Gemma3 is supported only from v0.6.0 onward; earlier versions cannot run it. At the same time, newer Ollama releases require a newer glibc, so they cannot run on older distributions such as CentOS 7.

The recommended operating environment is a recent Linux distribution such as Ubuntu 22.04 or Ubuntu 24.04. These ship with a newer glibc and have better support for new models and NVIDIA drivers.

Yesterday I tested running qwen3:32b on Ollama v0.6.0. Although the model file downloaded successfully, the model could not run. Running qwen3:32b produced the error:

Error: unable to load model
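Before upgrading Ollama on an older distribution, it is worth confirming the system's glibc version first. A minimal sketch using Python's standard library (the `MIN_GLIBC` threshold below is a placeholder for illustration; the actual minimum for a given Ollama release should be taken from its release notes):

```python
# Check the local glibc version before installing a newer Ollama build.
# For context: CentOS 7 ships glibc 2.17, Ubuntu 22.04 ships 2.35.
import platform

MIN_GLIBC = (2, 27)  # hypothetical threshold -- check Ollama's release notes


def glibc_ok(required=MIN_GLIBC):
    """Return True/False if glibc is detected, None otherwise (e.g. musl)."""
    libc, version = platform.libc_ver()
    if libc != "glibc" or not version:
        return None
    major_minor = tuple(int(p) for p in version.split(".")[:2])
    return major_minor >= required


print(platform.libc_ver(), glibc_ok())
```

On CentOS 7 this check would fail for any threshold above (2, 17), which is consistent with newer Ollama builds refusing to start there.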
The latest release of Ollama is v0.6.6, and the current release candidate is v0.6.7-rc0. When a new version is released, the newly supported models are announced: for example, v0.6.0 announced support for Gemma3, and v0.6.6 announced support for DeepCoder. So far, the v0.6.7 release candidate has not announced support for Qwen3.

Through testing, I found that qwen3:32b can run under v0.6.7, but with stability issues: the Qwen3 model is automatically unloaded at the end of each session and reloaded at the start of the next one. On the application side this makes responses slow, with each question stalling for a while. I recommend waiting until a new Ollama version officially announces Qwen3 support before deploying; the current stability is not good enough to use Qwen3 normally.

The file sizes of the various Qwen3 models are:

For local deployment, consider qwen3:30b-a3b and qwen3:32b. They are similar in size but different in architecture: qwen3:32b is a dense model, while qwen3:30b-a3b is a mixture-of-experts (MoE) model. The most powerful Qwen3 model is qwen3:235b-a22b at 142 GB, which, like qwen3:30b-a3b, is a MoE model. With four V100 or 3060 cards, consider deploying qwen3:32b or qwen3:30b-a3b; with eight A800 cards, consider qwen3:235b-a22b.

Here is the A800 machine that I just installed yesterday:
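The session-end unloading described above can sometimes be mitigated with Ollama's documented `keep_alive` parameter, which controls how long a model stays in memory after a request (`-1` means keep it loaded indefinitely). A minimal sketch of building such a request for the `/api/generate` endpoint; the endpoint and parameter come from Ollama's API documentation, but whether this actually works around the v0.6.7-rc instability with Qwen3 is an assumption to verify, not a confirmed fix:

```python
# Build a request for Ollama's /api/generate endpoint that asks the
# server to keep the model loaded after responding (keep_alive=-1).
import json
import urllib.request


def build_generate_request(model, prompt, host="http://localhost:11434"):
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": -1,  # -1 = keep the model in memory indefinitely
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


req = build_generate_request("qwen3:32b", "hello")
print(req.full_url, json.loads(req.data)["keep_alive"])
# Actually sending it requires a running Ollama server:
#   resp = urllib.request.urlopen(req)
```

The same `keep_alive` value can also be set globally via the `OLLAMA_KEEP_ALIVE` environment variable on the server side.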