New members of the Qwen3 family: the Qwen3-Embedding series models are here!

Written by Silas Grey
Updated on: June 13, 2025

New members of the Qwen3 family have arrived, bringing upgraded text processing capabilities!

Core content:
1. The release and features of the Qwen3-Embedding series of models
2. Model performance on multilingual text understanding and benchmark evaluations
3. Open-source links and the technical report


Today, we officially released the Qwen3-Embedding series, a new member of the Qwen model family. These models are designed for text representation, retrieval, and reranking tasks. They are trained on the Qwen3 base model and fully inherit Qwen3's strengths in multilingual text understanding.


Note: "MRL Support" indicates whether the Embedding model supports custom dimensions of the final vector. "Instruct Aware" indicates whether the Embedding or Reranker model supports customized input instructions based on different tasks.


In multiple benchmarks, the Qwen3-Embedding series has demonstrated excellent performance on text representation and reranking tasks.



Note:
  • We use the retrieval datasets from MTEB (eng, v2), MTEB (cmn, v1), MTEB (Multilingual) and MTEB (Code) for testing, denoted as MTEB-R, CMTEB-R, MMTEB-R, MTEB-Code respectively.
  • Reranking results are computed over the top-100 candidates recalled by the Qwen3-Embedding-0.6B vector model (see the pipeline sketch below).
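
To make the protocol concrete, here is a hedged sketch of the retrieve-then-rerank pipeline the note describes: embed the query and corpus, recall the top candidates by vector similarity, then rescore them with the reranker. The CrossEncoder-style interface and the Qwen/Qwen3-Reranker-0.6B checkpoint name are assumptions on our part; consult the model cards for the official usage.

```python
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

embedder = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")
# The CrossEncoder-style interface for the reranker is assumed here;
# the Qwen3-Reranker model card documents the official usage.
reranker = CrossEncoder("Qwen/Qwen3-Reranker-0.6B")

query = "how do transformer models encode word order?"
corpus = [
    "Positional encodings inject information about token order.",
    "Gradient clipping stabilizes training of deep networks.",
]

# Stage 1: vector recall -- embed and keep the top-100 candidates.
q_emb = embedder.encode([query])
d_emb = embedder.encode(corpus)
recall_scores = (q_emb @ d_emb.T)[0]
top_k = np.argsort(-recall_scores)[:100]

# Stage 2: rerank the recalled candidates with the reranker.
pairs = [(query, corpus[i]) for i in top_k]
order = np.argsort(-reranker.predict(pairs))
reranked = [corpus[top_k[i]] for i in order]
print(reranked[0])
```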

Currently, this series of models has been open-sourced on Hugging Face, ModelScope, and GitHub. Users can also access the latest text embedding model services directly through Alibaba Cloud's Bailian platform.


Open-source links:

ModelScope:

  • https://modelscope.cn/collections/Qwen3-Embedding-3edc3762d50f48

  • https://modelscope.cn/collections/Qwen3-Reranker-6316e71b146c4f


Hugging Face:

  • https://huggingface.co/collections/Qwen/qwen3-embedding-6841b2055b99c44d9a4c371f

  • https://huggingface.co/collections/Qwen/qwen3-reranker-6841b22d0192d7ade9cdefea


GitHub:

  • https://github.com/QwenLM/Qwen3-Embedding


Technical Report:

  • https://github.com/QwenLM/Qwen3-Embedding/blob/main/qwen3_embedding_technical_report.pdf
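
For a quick start, the snippet below shows one way to load an open-source checkpoint and embed a few sentences. This is a minimal sketch assuming the Hugging Face checkpoint works with the standard sentence-transformers interface; the model card documents the officially recommended settings.

```python
from sentence_transformers import SentenceTransformer

# Load the open-source checkpoint from Hugging Face (interface assumed).
model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

sentences = [
    "What is the capital of China?",
    "Beijing is the capital of the People's Republic of China.",
]
embeddings = model.encode(sentences)

# Cosine-style similarity matrix between the two sentences.
print(model.similarity(embeddings, embeddings))
```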



Main Features


Excellent generalization:  The Qwen3-Embedding series achieves industry-leading results across a wide range of downstream task evaluations. The 8B-parameter Embedding model ranks first on the MTEB multilingual leaderboard (score 70.58 as of June 6, 2025), outperforming many commercial API services. The series' reranking models also perform strongly across text retrieval scenarios, significantly improving the relevance of search results.


Flexible model architecture:  The Qwen3-Embedding series offers three configurations, from 0.6B to 8B parameters, to meet performance and efficiency requirements across scenarios. Developers can flexibly combine the representation and reranking modules to extend functionality.

In addition, the models support the following customization features (a usage sketch follows this list):

1) Custom representation dimensions: users can adjust the output embedding dimension to fit actual needs, effectively reducing application costs;

2) Instruction adaptation and optimization: users can customize the instruction template to improve performance on specific tasks, languages, or scenarios.
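
A brief sketch of how these two features might be used, assuming an MRL-style truncate-and-renormalize convention for dimension reduction and an "Instruct: ... Query: ..." prompt template (both are illustrative assumptions; the model cards define the exact conventions):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

# 1) Custom representation dimension (MRL-style, assumed convention):
#    keep the leading 256 components of the full vector, re-normalize.
full = model.encode(["an example sentence"])   # full-dimensional vectors
reduced = full[:, :256]
reduced = reduced / np.linalg.norm(reduced, axis=1, keepdims=True)

# 2) Instruction adaptation: prepend a task description to the query.
#    This template is illustrative, not the documented format.
instruction = "Given a web search query, retrieve relevant passages"
query = f"Instruct: {instruction}\nQuery: what is MTEB?"
query_emb = model.encode([query])
```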


Comprehensive multilingual support:  The Qwen3-Embedding series supports more than 100 languages, covering mainstream natural languages and multiple programming languages. The models offer strong multilingual, cross-lingual, and code retrieval capabilities, effectively meeting data processing needs in multilingual scenarios.



Model Architecture


Built on the Qwen3 base model, our Embedding and Reranker models adopt a dual-tower and a single-tower architecture, respectively. Through LoRA fine-tuning, we preserve the base model's text understanding ability to the greatest extent possible.


The specific implementation is as follows:

1) The Embedding model takes a single text segment as input and uses the last layer's hidden state at the final "EOS" token as the semantic representation of the input text (see the sketch below);

2) The Reranker model takes a text pair (e.g., a user query and a candidate document) as input and uses its single-tower structure to compute and output a relevance score for the pair.
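
A minimal sketch of the embedding extraction described in 1), using the generic transformers API: run the model and take the last-layer hidden state at the final non-padding (EOS) position. Padding and pooling details may differ from the official implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
model.eval()

texts = ["Qwen3-Embedding supports more than 100 languages."]
batch = tokenizer(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state       # (batch, seq, dim)

# Position of the last non-padding token in each sequence (the EOS slot).
last = batch["attention_mask"].sum(dim=1) - 1
embeddings = hidden[torch.arange(hidden.size(0)), last]
embeddings = torch.nn.functional.normalize(embeddings, dim=-1)
```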




Model Training


The training of the Qwen3-Embedding series inherits the multi-stage training paradigm of the GTE-Qwen series, deeply optimized for its target application scenarios.


For the Embedding model, we adopt a three-stage training architecture: the first stage performs contrastive pre-training on ultra-large-scale weakly supervised data; the second stage performs supervised training on high-quality labeled data; and finally, multiple candidate models are merged via a model-merging strategy to improve overall performance. This staged mechanism effectively balances the model's generalization ability and task adaptability.
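
As a concrete reference for the first stage, the sketch below shows a standard InfoNCE-style contrastive loss with in-batch negatives, a common formulation for contrastive pre-training; the technical report defines the exact objective used for Qwen3-Embedding.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor,
                  doc_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch-negative contrastive loss: row i of doc_emb is the
    positive for row i of query_emb; every other row is a negative."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                 # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage with random vectors standing in for model outputs.
loss = info_nce_loss(torch.randn(8, 1024), torch.randn(8, 1024))
```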


For the Reranker model, guided by experimental results, we train directly on high-quality labeled data via supervised learning to improve training efficiency. Notably, in the first, weakly supervised stage of Embedding training, we built a multi-task adaptive prompt system: leveraging the text generation capability of the Qwen3 base model, we dynamically generate weakly supervised text pairs tailored to different task types and languages. This breaks through the traditional reliance on mining community forums or filtering open-source datasets for weakly supervised pairs, and enables efficient generation of large-scale weakly supervised data.
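
Purely as an illustration of such a prompt system, the template below sketches how one might ask a Qwen3 chat model to synthesize a query for a given document, conditioned on task type and language. The template and helper function are hypothetical; the actual prompts and pipeline are described in the technical report.

```python
# Hypothetical prompt template for synthesizing weakly supervised
# query-document pairs; not the actual prompts used in training.
PAIR_PROMPT = """You are generating retrieval training data.
Task type: {task}
Language: {language}
Document:
{document}

Write one {language} query for which this document is a relevant answer."""

def build_prompt(task: str, language: str, document: str) -> str:
    return PAIR_PROMPT.format(task=task, language=language, document=document)

print(build_prompt(
    task="web search",
    language="English",
    document="Qwen3-Embedding models are trained with synthetic weak supervision.",
))
```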




Future Development


The Qwen3-Embedding series is a new starting point. Building on continuous improvements to the Qwen base models, we will keep improving the training efficiency of our text representation and reranking models to enhance their deployment performance in real-world scenarios.


We also plan to extend our representation system to multiple modalities and build cross-modal semantic understanding capabilities. We look forward to developers exploring a broader range of applications with the Qwen3-Embedding series and driving its adoption deeper into diverse business scenarios.