Say goodbye to Google? Alibaba open-sources ZeroSearch: large-model search training costs drop by 88%, with performance surpassing Google Search

Alibaba open-sources ZeroSearch, which rethinks large-model search technology, cutting costs by 88% while outperforming Google Search!
Core content:
1. ZeroSearch project overview: Innovative large-model search engine framework based on reinforcement learning
2. Main functions: No need for real search engine interaction, dynamic control of document quality, and significant cost reduction
3. Technical principles: simulated search engine, lightweight supervised fine-tuning, curriculum learning mechanism, reward signal based on F1 score
In today's digital age, large language models (LLMs) have made significant progress in natural language processing, but in practical applications they still face problems such as hallucinated content and outdated information. To address this, retrieval-augmented generation (RAG) has emerged, improving a model's generation ability by integrating external knowledge. However, traditional retrieval-augmented methods rely on interaction with real search engines, which is not only costly but also suffers from uncontrollable document quality. The ZeroSearch project, open-sourced by Alibaba's Tongyi Lab, proposes an innovative solution: it elicits the search ability of large models by simulating a search engine, with no interaction with real search engines, greatly reducing training costs while improving the model's reasoning ability.
1. Project Overview
ZeroSearch is an innovative large-model search framework open-sourced by Alibaba's Tongyi Lab. It uses reinforcement learning to elicit the search capabilities of large models without interacting with real search engines. Through lightweight supervised fine-tuning and a curriculum learning mechanism, it turns a large model into a retrieval module that can generate relevant or noisy documents for a given query, with dynamic control over generation quality. The framework outperforms Google Search on multiple question-answering datasets while cutting training costs by over 80%, and it offers strong scalability and versatility.
2. Main Functions
1. No real search engine interaction required
ZeroSearch simulates a search engine, avoiding interaction with real search engines (such as Google) and thereby reducing both cost and the unpredictability of returned documents. The model can complete search tasks entirely in a local environment without relying on external APIs.
2. Dynamically Control Document Quality
ZeroSearch supports generating relevant or noisy documents, and can flexibly control the quality of generated documents by adjusting the keywords in the prompt. This mechanism provides a variety of retrieval scenarios for training, which helps improve the robustness of the model.
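To make this concrete, here is a minimal sketch of how a single keyword in the simulation prompt could flip the generated documents between useful and noisy. The template wording and the `build_prompt` helper are illustrative assumptions, not ZeroSearch's actual prompt.

```python
# Hypothetical sketch: toggling document quality with one prompt keyword.
# The template text is an assumption, not ZeroSearch's actual prompt.
SIMULATION_PROMPT = (
    "You are a search engine. Given the query below, write five {quality} "
    "documents that {behavior} the question.\n\nQuery: {query}\nDocuments:"
)

def build_prompt(query: str, noisy: bool) -> str:
    """Request useful or noisy documents from the simulation LLM."""
    if noisy:
        return SIMULATION_PROMPT.format(
            quality="noisy", behavior="are only loosely related to", query=query)
    return SIMULATION_PROMPT.format(
        quality="useful", behavior="directly help answer", query=query)

print(build_prompt("Who wrote Hamlet?", noisy=True))
```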
3. Significantly reduce costs
Compared to reinforcement learning training against a real search engine, ZeroSearch's training cost is reduced by over 80%. This makes large-scale training far more feasible, especially for researchers and developers with limited resources.
4. Support for multiple models and algorithms
ZeroSearch is compatible with large models of different parameter sizes (such as 3B, 7B, and 14B) and supports a variety of reinforcement learning algorithms (such as PPO and GRPO). This flexibility enables the framework to adapt to different application scenarios and needs.
3. Technical Principle
1. Simulating search engines
ZeroSearch turns the knowledge of a large model into a simulated search engine that generates relevant or noisy documents for a given query, replacing the real search engine entirely. The model can thus complete search tasks locally without relying on external APIs.
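Conceptually, the simulated engine is a drop-in replacement for a search API: a text-generation call whose output is parsed into "retrieved" documents. The sketch below uses a stubbed `generate` function and a blank-line document separator; both are illustrative assumptions, not ZeroSearch's actual implementation.

```python
from typing import List

def generate(prompt: str) -> str:
    # Stand-in for a call to the local simulation LLM; returns canned text
    # so this sketch runs end to end.
    return "Doc 1: directly relevant.\n\nDoc 2: partially relevant.\n\nDoc 3: noise."

def simulated_search(query: str, k: int = 5) -> List[str]:
    """Drop-in replacement for a real search API: no external requests."""
    text = generate(f"Generate {k} search-result documents for: {query}")
    docs = [d.strip() for d in text.split("\n\n") if d.strip()]
    return docs[:k]

print(simulated_search("capital of France"))
```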
2. Lightweight supervised fine-tuning
ZeroSearch fine-tunes the large model with a small amount of annotated data, enabling it to generate high-quality or low-quality documents to meet different training needs. This fine-tuning mechanism not only improves the model's retrieval ability, but also reduces training costs.
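One plausible shape for such annotated data pairs the same query with both a useful and a noisy completion, so a single model learns to produce either style on demand. The field names below are assumptions, not the actual dataset schema.

```python
# Hypothetical SFT records for the simulation LLM; field names are assumed.
sft_examples = [
    {
        "query": "When was the Eiffel Tower completed?",
        "style": "useful",
        "documents": "The Eiffel Tower was completed in March 1889 ...",
    },
    {
        "query": "When was the Eiffel Tower completed?",
        "style": "noisy",
        "documents": "The Louvre is a famous museum in Paris ...",
    },
]
```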
3. Curriculum Learning Mechanism
During training, ZeroSearch gradually increases the noise level of the generated documents, letting the model start from simple scenarios and progressively adapt to more challenging ones. This curriculum learning mechanism significantly improves the model's reasoning ability.
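As a rough sketch, the noise probability can be ramped from a start threshold to an end threshold over training; the quick-start script in section 5 passes 0.25 and 0.5 for these. A linear ramp is shown below for simplicity, and the schedule ZeroSearch actually uses may be shaped differently.

```python
import random

# Threshold and step values taken from the quick-start training command.
START_THRESHOLD, END_THRESHOLD, TOTAL_STEPS = 0.25, 0.5, 203

def noise_probability(step: int) -> float:
    """Linearly ramp the chance of serving a noisy document (assumed schedule)."""
    frac = min(step / TOTAL_STEPS, 1.0)
    return START_THRESHOLD + frac * (END_THRESHOLD - START_THRESHOLD)

def serve_noisy(step: int) -> bool:
    return random.random() < noise_probability(step)

print(noise_probability(0), noise_probability(203))  # 0.25 0.5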
4. Reward mechanism based on F1 score
ZeroSearch uses the F1 score as a reward signal and focuses on the accuracy of the answer. This mechanism ensures that the answers generated by the model match the true answers as closely as possible, thereby improving the overall performance of the model.
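A minimal sketch of a token-level F1 reward follows; this is the standard QA F1 formulation, though ZeroSearch's exact normalization details may differ.

```python
from collections import Counter

def f1_reward(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between the predicted and gold answers."""
    pred = prediction.lower().split()
    gold = ground_truth.lower().split()
    if not pred or not gold:
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

print(f1_reward("in march 1889", "march 1889"))  # 0.8
```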
5. Multi-round Interaction Template
ZeroSearch designs clear reasoning, search, and answering stages, guiding the model to complete tasks step by step through structured tags (such as `<think>`, `<search>`, and `<answer>`). This multi-round interaction template not only improves the transparency of the model but also enhances its reliability.
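The sketch below shows how such a rollout could be parsed. Only the tag names come from the description above; the example trajectory itself is made up.

```python
import re

# A made-up rollout; only the tag names are taken from the template description.
rollout = (
    "<think>I need the Eiffel Tower's completion date.</think>"
    "<search>Eiffel Tower completion date</search>"
    "<think>The returned documents say March 1889.</think>"
    "<answer>1889</answer>"
)

def extract(tag: str, text: str) -> list:
    """Pull the contents of every <tag>...</tag> span from a rollout."""
    return re.findall(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)

print(extract("search", rollout))  # queries sent to the simulated engine
print(extract("answer", rollout))  # final answer, scored with the F1 reward
```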
4. Application Scenarios
1. Intelligent Question Answering System
ZeroSearch can answer user questions quickly and accurately, and is suitable for scenarios such as intelligent customer service and intelligent assistants. By simulating a search engine, the model is able to provide more accurate and reliable answers.
2. Content Creation
ZeroSearch can help creators obtain information, generate drafts, or find inspiration, and is suitable for fields such as news, copywriting, and academic writing. By dynamically controlling document quality, the model can provide creators with diverse informational support.
3. Education and Learning
ZeroSearch can provide students with instant answers, supporting online education and intelligent tutoring. By simulating a search engine, the model can provide more accurate and reliable knowledge support.
4. Enterprise Knowledge Management
ZeroSearch can help employees quickly retrieve internal company resources and improve work efficiency. By dynamically controlling document quality, the model can provide enterprises with diversified knowledge-management support.
5. Research and Development
ZeroSearch can provide researchers with the latest research results and accelerate the research process. By simulating a search engine, the model can provide more accurate and reliable research support.
5. Quick Use
1. Install Dependencies
First, set up an environment with the necessary dependencies. The installation steps are as follows:
```bash
conda create -n zerosearch python=3.9  # Create a new Conda environment
conda activate zerosearch              # Activate the environment
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121  # Install PyTorch
pip install vllm==0.6.3                # Install vLLM
pip install wandb                      # Install Weights & Biases for experiment tracking
pip install serpapi                    # Install SerpAPI for interacting with real search engines (optional)
```
2. Download the training dataset
Next, download the ZeroSearch training dataset. These datasets are used to train and fine-tune the model:
```bash
huggingface-cli download --repo-type dataset --resume-download sunhaonlp/ZeroSearch_dataset --local-dir ZeroSearch_dataset
```
This command will download the ZeroSearch dataset from the Hugging Face dataset repository and save it to the local `ZeroSearch_dataset` folder.
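Since the dataset lives on the Hugging Face Hub, you can also sanity-check it with the `datasets` library. The splits and columns printed depend on the dataset's actual schema, which is not spelled out here.

```python
from datasets import load_dataset

# Load directly from the Hub (the same repo the CLI command downloads).
ds = load_dataset("sunhaonlp/ZeroSearch_dataset")
print(ds)                  # available splits and columns
split = list(ds.keys())[0]
print(ds[split][0])        # peek at one example
```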
3. Download the simulated LLM
ZeroSearch uses a pre-trained simulated LLM to generate documents. You can choose a model with different parameter sizes according to your needs:
```bash
huggingface-cli download --resume-download sunhaonlp/SearchSimulation_3B --local-dir SearchSimulation_3B
huggingface-cli download --resume-download sunhaonlp/SearchSimulation_7B --local-dir SearchSimulation_7B
huggingface-cli download --resume-download sunhaonlp/SearchSimulation_14B --local-dir SearchSimulation_14B
```
These commands will download simulation LLMs of different parameter scales from the Hugging Face model repository and save them locally to the folders `SearchSimulation_3B` , `SearchSimulation_7B` , and `SearchSimulation_14B` .
4. Start the Local Simulation Server
Start a local simulation server to use the simulated LLM during training:
```bash
python -m sglang.launch_server --model-path SearchSimulation_3B --host 0.0.0.0 --tp 2 --dp 2 --port 6001
```
This command starts a local server that listens on port 6001 and uses the SearchSimulation_3B model as the simulated search engine. You can change the model path and port as needed.
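Once the server is up, you can smoke-test it over HTTP. The sketch below assumes the sglang server exposes an OpenAI-compatible completions endpoint, as sglang servers typically do; the prompt wording is illustrative, not ZeroSearch's actual simulation template.

```python
import requests

resp = requests.post(
    "http://localhost:6001/v1/completions",  # assumed OpenAI-compatible endpoint
    json={
        "model": "SearchSimulation_3B",
        "prompt": "Generate five search-result documents for the query: "
                  "When was the Eiffel Tower completed?",
        "max_tokens": 512,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```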
5. Conduct reinforcement learning training
Finally, use the script provided by ZeroSearch for reinforcement learning training. The following is an example using the GRPO algorithm:
```bash
bash train_grpo.sh NUM_GPUS_PER_NODE 4 MODEL_PATH Llama-3.2-3B \
  DATA_PATH ZeroSearch_dataset TOTAL_STEPS 203 IP localhost \
  SEARCH_MODE simulate_sft SIMULATION_LLM SearchSimulation_3B \
  START_THRESHOLD 0.25 END_THRESHOLD 0.5
```
This script starts the training process, using 4 GPUs, `Llama-3.2-3B` as the policy model, `ZeroSearch_dataset` as the training dataset, and a total of 203 steps. `SEARCH_MODE` is set to `simulate_sft`, indicating a simulated search engine with lightweight supervised fine-tuning, and `SIMULATION_LLM` is set to `SearchSimulation_3B`, indicating a simulation LLM with 3B parameters. `START_THRESHOLD` and `END_THRESHOLD` are set to 0.25 and 0.5 respectively to control the difficulty of the curriculum learning mechanism.
6. Conclusion
ZeroSearch is an innovative large-model search framework open-sourced by Alibaba's Tongyi Lab. By simulating a search engine, it elicits the search capabilities of large models without interacting with real search engines, greatly reducing training costs while improving the model's reasoning ability. Its performance on multiple question-answering datasets exceeds that of Google Search, and it offers strong scalability and versatility. Open-sourcing ZeroSearch brings new possibilities to the field of natural language processing and provides researchers and developers with an efficient and flexible tool.