Bocha officially releases semantic ranking model (bocha-semantic-reranker)

Explore the Bocha semantic reranker model to improve the accuracy of search results.
Core content:
1. The release of the Bocha semantic reranker model and its API
2. The working principle of the model: secondary sorting optimization based on text semantics
3. The necessity of the semantic reranker model and its impact on RAG applications
1. Overview
Bocha has officially released its semantic ranking model (bocha-semantic-reranker) and the accompanying ranking API (Rerank API). Bocha Semantic Reranker can be used to improve the accuracy of search results in search applications and RAG applications.
2. What is the Bocha Semantic Reranker?
Bocha Semantic Reranker is a ranking model (Rerank Model) based on text semantics; its main purpose is to improve the quality of search results. In search and recommendation systems, it refines the preliminary ranking produced by keyword search, vector search, or hybrid search. Specifically, after the initial BM25 or RRF ranking, Bocha Semantic Reranker uses semantic information to re-rank the top-N candidate documents. In this process, the model scores and ranks each document according to how deeply the query and the document content match semantically, thereby improving the user's search experience. Because this method is a second-pass optimization of the preliminary ranking, it is called a "Reranker".
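The two-stage flow described above can be sketched as follows. This is a minimal illustration, not Bocha's implementation: `rerank_top_n` and `toy_score` are hypothetical names, and `toy_score` is a crude word-overlap placeholder standing in for the call to a real semantic reranker.

```python
from typing import Callable, List, Tuple


def rerank_top_n(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[int, float]]:
    """Re-score first-stage candidates; return (original_index, score), best first."""
    scored = [(i, score_fn(query, doc)) for i, doc in enumerate(candidates)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_score(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


# Candidates in the order a first-stage retriever (e.g. BM25 or RRF) returned them.
first_stage = [
    "weather forecast for tomorrow",
    "alibaba esg report 2024 highlights",
    "alibaba quarterly earnings",
]
reranked = rerank_top_n("alibaba 2024 esg report", first_stage, toy_score, top_n=2)
print(reranked)
```

In a real system, `score_fn` would be replaced by a call to the reranker, and only the top-N first-stage candidates would be passed in to keep latency and cost bounded.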
3. Why do we need a semantic ranking model?
The need for semantic ranking stems from the limitations of traditional retrieval methods (such as BM25) when processing complex queries. Traditional methods rely mainly on keyword matching and ignore the deeper semantics of the text. For example, two words may look different on the surface yet be very close in meaning, and keyword matching cannot capture that similarity. With the rise of RAG (Retrieval-Augmented Generation) applications, the need has become more prominent, because ranking by keyword matching alone often fails to surface the most relevant results. RAG applications combine retrieval and generation: they retrieve relevant documents from a large document library and generate answers from them. How well the system evaluates the semantic relevance of the retrieved documents to the query therefore directly affects the quality of the generated answers. If the retrieved documents do not match the query intent, the generated answers may be inaccurate or irrelevant, so a more accurate semantic ranking mechanism is needed in the retrieval stage.
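The keyword-matching limitation mentioned above can be seen in a tiny example. The function below is a deliberately simplified stand-in for keyword retrieval (it is not BM25), and the query and documents are made up for illustration.

```python
def keyword_overlap(query: str, document: str) -> int:
    """Count distinct words shared by the query and the document (case-insensitive)."""
    return len(set(query.lower().split()) & set(document.lower().split()))


query = "cheap automobile repair"
doc_a = "affordable car maintenance services"  # same meaning, no shared words
doc_b = "cheap flights and hotel deals"        # different meaning, one shared word

overlap_a = keyword_overlap(query, doc_a)
overlap_b = keyword_overlap(query, doc_b)
print(overlap_a, overlap_b)
```

A purely lexical ranker would place `doc_b` above `doc_a` even though `doc_a` is the one that answers the query; this is the gap a semantic reranker is meant to close.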
Semantic ranking introduces deep learning and natural language processing technologies, allowing the search system to sort search results based on the true intent of the query, rather than just the superficial keywords. This approach can more accurately understand the context of the query and user needs, thereby improving the relevance of search results and significantly improving the user experience. In particular, in RAG applications, semantic ranking helps ensure that the retrieved documents have sufficient semantic matching, thereby providing more valuable input for subsequent generation tasks and improving the overall question-answering effect.
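As a rough sketch of how reranked results might feed the generation step of a RAG application: the helper below keeps only documents above a score threshold and joins them into a context string. The name `build_context`, the 0.5 threshold, and the three-document cap are illustrative assumptions; the result shape (`document.text`, `relevance_score`) follows the example response shown later in this article and requires `return_documents` to be enabled.

```python
from typing import Any, Dict, List


def build_context(
    results: List[Dict[str, Any]],
    min_score: float = 0.5,  # assumed cut-off, tune per application
    max_docs: int = 3,       # assumed cap on context size
) -> str:
    """Join the texts of the highest-scoring documents into one context string."""
    kept = [r for r in results if r["relevance_score"] >= min_score]
    kept.sort(key=lambda r: r["relevance_score"], reverse=True)
    return "\n\n".join(r["document"]["text"] for r in kept[:max_docs])


sample_results = [
    {"index": 0, "document": {"text": "ESG report summary"}, "relevance_score": 0.72},
    {"index": 1, "document": {"text": "Anniversary letter"}, "relevance_score": 0.57},
    {"index": 2, "document": {"text": "Unrelated press note"}, "relevance_score": 0.08},
]
context = build_context(sample_results)
print(context)
```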
4. How does the Bocha semantic ranking model score documents?
The model scores each document from two inputs: the query statement (the user's input question) and the content of the matching document (usually a text of at most 512 tokens). The scoring process is as follows:
Evaluate semantic relevance: The model evaluates the semantic relevance of the query to each document to determine whether the document effectively answers the user's query or closely matches the query intent.
Assign a rerank score: Based on semantic relevance, the model assigns each document a rerankScore between 0 and 1. The higher the score, the stronger the document's semantic relevance to the query and the better it meets the user's needs. Generally, a score close to 1 indicates high relevance, and a score close to 0 indicates low or no relevance.
Score | Meaning |
0.75~1 | The document is highly relevant and fully answers the question, although it may contain extra text not relevant to the question. |
0.5~0.75 | The document is relevant to the question but lacks the details to make it complete. |
0.2~0.5 | The document is somewhat relevant to the question; it partially answers the question, or only addresses some aspects of the question. |
0.1~0.2 | The document is relevant to the question but answers only a small part of it. |
0~0.1 | The document is irrelevant to the question. |
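The score bands in the table can be turned into a small lookup for logging or filtering. This is a sketch: the function name and the short labels are made up here, and because the table's ranges share their boundary values, the code assumes each boundary belongs to the higher band.

```python
def describe_score(score: float) -> str:
    """Map a rerank score in [0, 1] to a short label based on the table's bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.75:
        return "highly relevant"
    if score >= 0.5:
        return "relevant, lacks detail"
    if score >= 0.2:
        return "somewhat relevant"
    if score >= 0.1:
        return "marginally relevant"
    return "irrelevant"


# The two scores from the example response later in this article.
for value in (0.7166407801262326, 0.5658672473649548):
    print(value, describe_score(value))
```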
5. Bocha Semantic Reranker
Bocha Semantic Reranker is based on the Transformer architecture and uses 80M parameters to achieve ranking results close to those of the world's top 280M and 560M parameter models. It has faster inference speed, lower cost, and higher cost-effectiveness.
6. How to use the Bocha Semantic Reranker API?
How to use
Register a Bocha developer account: Visit the Bocha AI open platform, scan the QR code to log in via WeChat, and create a new account. After logging in, you will see four types of APIs: Web Search API, AI Search API, Agent Search API, and Semantic Reranker API.
Get an API KEY: In the upper right corner of the homepage or in the left menu, you will find "API KEY Management". Click it to create a new API KEY. Save the key, because you will need it when calling the Rerank API.
View the list of supported ranking models: Currently, three models are supported: bocha-semantic-reranker-cn, bocha-semantic-reranker-en, and gte-rerank. The first two models are available by invitation only, while the gte-rerank model can be used directly.
Call the Rerank API: In your application, you can use the following code to call the Rerank API:
import json

import requests

url = "https://api.bochaai.com/v1/rerank"

# Build the JSON request body. Quotation marks inside the documents are escaped.
payload = json.dumps({
    "model": "gte-rerank",
    "query": "Alibaba 2024 ESG Report",
    "top_n": 2,
    "return_documents": True,  # Python boolean; serialized as JSON true
    "documents": [
        "Alibaba Group released the 2024 Fiscal Year Environmental, Social and Governance (ESG) Report (hereinafter referred to as the \"Report\"), sharing in detail the progress made in various aspects of ESG in the past year. The report shows that Alibaba has made solid progress in carbon reduction initiatives, and the Group's own net carbon emissions from operations and carbon intensity of the value chain continue to achieve \"double reductions\". The Group also continues to use digital technology and platform capabilities to serve inclusive development such as accessibility, medical care, aging-friendly and small and medium-sized enterprises. Alibaba Group CEO Wu Yongming said in the report: \"The core of ESG is about how to become a better company. For 25 years, our ESG-related actions have formed the company's background, which is as important as Alibaba in creating business value. While the Group has clarified the two major business strategies of 'user first' and 'AI-driven', we have also made it clear that ESG remains one of Alibaba's cornerstone strategies. Alibaba has made solid progress in reducing carbon emissions.\"",
        "The core of ESG is about how to become a better company. This year marks the 25th anniversary of Alibaba's founding. Over the past 25 years, Alibaba has adhered to the principle of \"making business easy for everyone\" and assisted the prosperity and development of domestic e-commerce. It has adhered to an open ecosystem, and the Moda community has opened more than 3,800 open source models. It has helped rural revitalization and has sent a total of 29 rural commissioners to 27 counties. It has promoted carbon reduction on the platform and pioneered the Scope 3+ carbon reduction plan. It has insisted on public welfare for all employees and used \"3 hours for everyone\" to bring small and beautiful changes... The company's background formed by these actions is as important as Alibaba, which creates commercial value. I hope that in this process, every Alibaba person can learn to make difficult but correct choices, stay forward-looking, maintain goodwill, and remain pragmatic. A better Alibaba is worth our joint efforts. Alibaba's mission for more than 20 years has been to make business easy for everyone. Today, this mission has been given new significance in the new era.",
    ],
})

headers = {
    "Authorization": "Bearer YOUR-API-KEY",  # replace with your own API KEY
    "Content-Type": "application/json",
}

# Send the POST request and print the raw JSON response.
response = requests.post(url, headers=headers, data=payload)
print(response.text)
Example response:
{
  "code": 200,
  "log_id": "56a3067f9b92dfd0",
  "msg": null,
  "data": {
    "model": "gte-rerank",
    "results": [
      {
        "index": 0,
        "document": {
          "text": "Alibaba Group released the \"2024 Fiscal Year Environmental, Social and Governance (ESG) Report\" (hereinafter referred to as the \"Report\"), sharing in detail the progress made in various aspects of ESG in the past year. The report shows that Alibaba has made solid progress in carbon reduction initiatives, and the group's own net carbon emissions from operations and carbon intensity of the value chain continue to achieve \"double reductions\". The group also continues to use digital technology and platform capabilities to serve inclusive development such as accessibility, medical care, aging-friendly and small and medium-sized enterprises. In the report, Alibaba Group CEO Wu Yongming said: \"The core of ESG is about how to become a better company. For 25 years, the company's background formed by our ESG-related actions is as important as Alibaba in creating business value. While the Group has clearly defined the two major business strategies of \"user first\" and \"AI driven\", we have also made it clear that ESG remains one of Alibaba's cornerstone strategies. Alibaba has made solid progress in reducing carbon emissions.\""
        },
        "relevance_score": 0.7166407801262326
      },
      {
        "index": 1,
        "document": {
          "text": "The core of ESG is about how to become a better company. This year marks the 25th anniversary of Alibaba. Over the past 25 years, Alibaba has adhered to the principle of \"making business easy for everyone\" and assisted the prosperity and development of domestic e-commerce; insisted on an open ecosystem, and the Moda community has opened more than 3,800 open source models; helped rural revitalization, and sent a total of 29 rural commissioners to 27 counties; promoted platform carbon reduction, and pioneered the scope 3+ carbon reduction plan; insisted on public welfare for all employees, and used \"3 hours for everyone\" to bring small and beautiful changes... The company's background formed by these actions is as important as Alibaba in creating commercial value. I hope that in this process, every Alibaba employee can learn to make difficult but correct choices, stay forward-looking, maintain goodwill, and remain pragmatic. A better Alibaba is worth our joint efforts. Alibaba's mission for more than 20 years has been to make it easy to do business anywhere. Today, this mission has been given new significance in the new era."
        },
        "relevance_score": 0.5658672473649548
      }
    ]
  }
}
Interface URL
https://api.bochaai.com/v1/rerank
Request method
POST
Request Parameters
Request Header
Parameter | Value | Description |
Authorization | Bearer {API KEY} | Authentication credential, for example: Bearer xxxxxx. To obtain an API KEY, go to Bocha AI Open Platform (https://open.bochaai.com) > API KEY Management. |
Content-Type | application/json | Format of the request body. |
Request Body
Parameter | Type | Required | Description |
model | String | yes | The model version used for ranking. Currently supported models, as listed in the usage steps above: bocha-semantic-reranker-cn, bocha-semantic-reranker-en, gte-rerank. |
query | String | yes | The user's search term. It can be natural language, for example: Tell me the key points of Alibaba's 2024 ESG report |
documents | Array<String> | yes | The array of documents to be ranked. At most 50 documents. |
top_n | Integer | no | The number of top-ranked documents to return. Defaults to the number of documents submitted. |
return_documents | Boolean | no | Whether to return the original text of each document in the ranked result list. Default: false |
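The request parameters above can be assembled and checked before sending. The sketch below uses only the Python standard library; `build_rerank_request` and `MAX_DOCUMENTS` are hypothetical names, and the client-side check simply mirrors the 50-document limit stated in the table. `top_n` is omitted from the body when not given, so the server-side default applies.

```python
import json
import urllib.request
from typing import List, Optional

RERANK_URL = "https://api.bochaai.com/v1/rerank"
MAX_DOCUMENTS = 50  # limit stated in the request body table


def build_rerank_request(
    api_key: str,
    query: str,
    documents: List[str],
    model: str = "gte-rerank",
    top_n: Optional[int] = None,
    return_documents: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Rerank API."""
    if len(documents) > MAX_DOCUMENTS:
        raise ValueError(f"at most {MAX_DOCUMENTS} documents are allowed, got {len(documents)}")
    body = {
        "model": model,
        "query": query,
        "documents": documents,
        "return_documents": return_documents,
    }
    if top_n is not None:
        body["top_n"] = top_n
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    "YOUR-API-KEY", "Alibaba 2024 ESG Report", ["doc one", "doc two"], top_n=1
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request, timeout=10)` and decode the JSON body of the response.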
Response Definition
Parameter | Type | Description |
code | Integer | Status code. 200 means the call was successful. |
log_id | String | Request ID. |
msg | String | Status information. |
data | Object | The returned result. |
data.model | String | The model used for ranking. |
data.results | Array | The ranked results. Example: [ { "index": 0, "document": { "text": "Alibaba Group released the \"2024 Fiscal Year Environmental, Social and Governance (ESG) Report\" (hereinafter referred to as the \"Report\"), sharing in detail the progress made in various aspects of ESG in the past year. The report shows that Alibaba has made solid progress in carbon reduction initiatives, and the group's own net carbon emissions from operations and carbon intensity of the value chain continue to achieve \"double reductions\". The group also continues to use digital technology and platform capabilities to serve inclusive development such as accessibility, medical care, aging-friendly and small and medium-sized enterprises. Alibaba Group CEO Wu Yongming said in the report: \"The core of ESG is about how to become a better company. For 25 years, our ESG-related actions have formed the company's background, which is as important as Alibaba in creating business value. While the group has clarified the two major business strategies of 'user first' and 'AI-driven', we have also made it clear that ESG remains one of Alibaba's cornerstone strategies. Alibaba has made solid progress in reducing carbon emissions.\"" }, "relevance_score": 0.7166407801262326 }, { "index": 1, "document": { "text": "The core of ESG is about how to become a better company. This year marks the 25th anniversary of Alibaba's founding. Over the past 25 years, Alibaba has adhered to the principle of \"making business easy for everyone\" and assisted the prosperity and development of domestic e-commerce; insisted on an open ecosystem, and the Moda community has opened more than 3,800 open source models; helped rural revitalization, and sent a total of 29 rural commissioners to 27 counties; promoted platform carbon reduction, and pioneered the scope 3+ carbon reduction plan; insisted on public welfare for all employees, and used \"3 hours for everyone\" to bring small and beautiful changes... The company's background formed by these actions is as important as Alibaba, which creates commercial value. I hope that in this process, every Alibaba person can learn to make difficult but correct choices, stay forward-looking, maintain goodwill, and remain pragmatic. A better Alibaba is worth our joint efforts. Alibaba's mission for more than 20 years has been to make business easy for everyone. Today, this mission has been given new significance in the new era." }, "relevance_score": 0.5658672473649548 }, ··· ] |
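The response fields defined above can be read into a small typed structure. This is a sketch under assumptions: `RerankResult` and `parse_rerank_response` are hypothetical names, any `code` other than 200 is treated as a failure, and `document` is treated as optional because the original text is returned only when `return_documents` is enabled.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class RerankResult:
    index: int
    relevance_score: float
    text: Optional[str] = None  # present only if the document text was returned


def parse_rerank_response(payload: Dict[str, Any]) -> List[RerankResult]:
    """Convert a decoded Rerank API response into a list of RerankResult objects."""
    if payload.get("code") != 200:
        raise RuntimeError(
            f"rerank call failed: code={payload.get('code')}, "
            f"msg={payload.get('msg')}, log_id={payload.get('log_id')}"
        )
    results = []
    for item in payload["data"]["results"]:
        document = item.get("document") or {}
        results.append(
            RerankResult(
                index=item["index"],
                relevance_score=item["relevance_score"],
                text=document.get("text"),
            )
        )
    return results


# Shortened stand-in for the example response; the texts are abbreviated here.
sample = {
    "code": 200,
    "log_id": "56a3067f9b92dfd0",
    "msg": None,
    "data": {
        "model": "gte-rerank",
        "results": [
            {"index": 0, "document": {"text": "first text"}, "relevance_score": 0.7166407801262326},
            {"index": 1, "relevance_score": 0.5658672473649548},
        ],
    },
}
parsed = parse_rerank_response(sample)
print(parsed)
```

Including `log_id` in the error message makes it easier to reference a specific request when reporting a failed call.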