Official announcement: Milvus SDK v2 is released! Native asynchronous interface, MCP support, performance improvements

Written by
Jasper Cole
Updated on: July 8, 2025
Recommendation

Milvus SDK v2 is a major upgrade that brings a more efficient, easier-to-use AI development experience!

Core content:
1. Milvus SDK v2 addresses the missing asynchronous interfaces and performance bottlenecks of v1
2. A unified interface design improves usability and cross-language collaboration
3. A detailed look at Python SDK v2, plus support for new features such as MCP

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)

Preface

"Milvus is good, but the interfaces and functions are a bit complicated!

When using Milvus, why do I always have to jump back and forth between the SDK documentation and Stack Overflow?

After finally getting the interface to work, we ran into performance bottlenecks and problems with asynchronous calls!

Isn't there a simpler, more unified, more efficient SDK?"

We have heard all the encouragement and suggestions you have shared in the community.

Today, we are officially launching Milvus SDK v2!

Focused on developer experience, Milvus SDK v2 tackles the complex interfaces, incomplete documentation, and missing functions of the past with a unified interface design, a flexible asynchronous mode, and significantly improved usability.

In one sentence: Milvus SDK v2 offers a simpler, easier-to-use interface and a more solid foundation for putting AI applications into production!

The specific features are explained below:

01

Pain point analysis: Why is change needed?

Milvus is a vector database that is popular among developers and widely used in AI application development. However, the Milvus v1 SDKs also had a number of inconveniences. Here are the main pain points:

Pain point 1: No asynchronous interface, which makes high concurrency hard to handle

In the v1 era, some SDKs (such as pymilvus) lacked native asynchronous support, so developers had to rely on threads or callbacks for concurrent operations. In high-concurrency scenarios such as batch data loading and parallel queries, this approach bloats the code, makes debugging much harder, and hurts the system's throughput and response time.
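To make the workaround concrete, here is a minimal sketch of the thread-based pattern described above. It does not call Milvus: blocking_search is a hypothetical stand-in for a synchronous SDK call, and the sleep only simulates a network round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_search(query_id: int) -> dict:
    """Hypothetical stand-in for a synchronous SDK call (no Milvus involved)."""
    time.sleep(0.05)  # simulated network round trip
    return {"query_id": query_id, "hits": [query_id * 10, query_id * 10 + 1]}


def search_many(query_ids: list, max_workers: int = 8) -> list:
    """Fan blocking calls out over worker threads and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, even if calls finish out of order.
        return list(pool.map(blocking_search, query_ids))


results = search_many([1, 2, 3, 4])
print([r["query_id"] for r in results])
```

Even in this small form, the application has to own a thread pool and size it by hand, which is the kind of overhead a native async interface removes.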

Pain point 2: Performance bottleneck: no Schema Cache, so insertion and query are inefficient

Beyond the asynchronous problem, the v1 SDK also had performance shortcomings. Without optimization mechanisms such as a Schema Cache, the schema is verified and parsed again and again during data insertion and querying, which holds back overall performance. The problem is most visible with massive data volumes, where it directly limits the real-time behavior and scalability of the application.

Pain point 3: Insufficient usability: cumbersome operations and inconsistent interface logic

Taking pymilvus as an example, v1 mixes an ORM style with procedural programming: common objects (such as Collection) are classes, while complex logic is handled by procedural functions. This mixed mode confuses newcomers and is inconsistent with the SDKs for other languages. In addition, naming and calling conventions differ between language SDKs, which adds to the learning burden and complicates cross-language team collaboration and code maintenance.

Pain point 4: Inconsistent interface functions and uneven completeness

Because the HTTP interface started late, the early RESTful API was very limited, and many operations (such as partition management and index building) could only be performed through other SDKs, which made for a poor experience. SDKs for languages other than Python had similar gaps and could not meet the needs of some users of those languages.

02

Solution: Milvus SDK v2

In response to the pain points above, we have comprehensively improved the Milvus SDKs across multiple programming languages. The released versions are:

  • Python SDK v2 (pymilvus.MilvusClient) (https://milvus.io/api-reference/pymilvus/v2.5.x/About.md)

  • Java v2 (https://github.com/milvus-io/milvus-sdk-java)

  • Go v2 (https://github.com/milvus-io/milvus/tree/client/v2.5.1/client)

  • NodeJS (https://github.com/milvus-io/milvus-sdk-node)

  • RESTful v2 (https://milvus.io/api-reference/restful/v2.5.x/About.md)

The new SDK brings the following benefits:

(1) Native asynchronous interface: fully embrace the era of high concurrency

To meet the needs of high-concurrency scenarios, Milvus SDK v2 introduces native asynchronous call support (https://milvus.io/docs/use-async-milvus-client-with-asyncio.md).

Taking the Python SDK as an example, v2.5.3 provides AsyncMilvusClient, which is built on asyncio and supports true async/await calls. Its interface has the same parameters and behavior as the synchronous MilvusClient; the only difference is how it is called.

With AsyncMilvusClient, developers can easily run multiple Milvus operations (insert, query, search, and so on) in parallel and make full use of asynchronous I/O to achieve high throughput.

Compared with the v1 approach that relied on Futures or callbacks, the new native asynchronous mode is more concise and efficient. Concurrent logic that used to be complex, such as batch vector insertion and parallel multi-query, can now be handled simply by awaiting asyncio.gather.

Native async/await lets Python applications take full advantage of concurrency, which significantly improves performance in scenarios such as batch data loading and parallel queries. It also integrates easily with asynchronous Python backend frameworks (such as aiohttp and FastAPI).
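The effect of asyncio.gather can be seen without a running server. The sketch below uses FakeAsyncClient, a hypothetical stand-in whose search method only mimics the shape of an awaitable client call, and records how many requests are in flight at once.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async client; no Milvus server is contacted."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.max_in_flight = 0

    async def search(self, collection_name: str, data: list, limit: int = 5) -> list:
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        await asyncio.sleep(0.01)  # simulated network wait; other tasks run meanwhile
        self.in_flight -= 1
        return [f"{collection_name}-hit-{i}" for i in range(limit)]


async def run_parallel_searches(client: FakeAsyncClient, queries: list) -> list:
    # Calling search() only creates coroutines; gather schedules them together.
    tasks = [client.search("example_collection", data=[q], limit=2) for q in queries]
    return await asyncio.gather(*tasks)


fake_client = FakeAsyncClient()
all_hits = asyncio.run(
    run_parallel_searches(fake_client, [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
)
print(len(all_hits), fake_client.max_in_flight)
```

All three simulated requests overlap on a single thread, and gather returns their results in the order the tasks were passed in.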

(2) Performance improvement: Schema Cache enables efficient data processing

The new version of Milvus SDK achieves a significant performance gain by introducing the Schema Cache mechanism.

After the collection schema is fetched for the first time, it is cached in local memory, so subsequent insert and query operations can use the cached information directly. This avoids the extra network latency and wasted CPU time caused by repeated requests and parsing.

In scenarios with strict real-time requirements, such as batch writes and high-frequency queries, this optimization significantly improves the system's response time and throughput while greatly reducing the load on the server.

By cutting the overhead of repeated schema parsing, the new SDK improves data processing efficiency and gives developers a solid technical basis for building high-performance applications.
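The idea behind a schema cache can be sketched in a few lines. This is an illustration of the technique only, not the SDK's actual implementation: SchemaCache, fake_describe_collection, and the invalidate method are all hypothetical names introduced here.

```python
from typing import Callable, Dict


class SchemaCache:
    """Illustrative cache: fetch each collection's schema once, then reuse it."""

    def __init__(self, fetch_schema: Callable[[str], dict]) -> None:
        self._fetch_schema = fetch_schema
        self._schemas: Dict[str, dict] = {}

    def get(self, collection_name: str) -> dict:
        # Only the first lookup per collection pays for the remote fetch.
        if collection_name not in self._schemas:
            self._schemas[collection_name] = self._fetch_schema(collection_name)
        return self._schemas[collection_name]

    def invalidate(self, collection_name: str) -> None:
        # Assumption: a cache like this must be dropped when the schema changes.
        self._schemas.pop(collection_name, None)


fetch_calls = []


def fake_describe_collection(name: str) -> dict:
    """Hypothetical stand-in for a schema request to the server."""
    fetch_calls.append(name)
    return {"fields": ["id", "embedding"], "dim": 128}


cache = SchemaCache(fake_describe_collection)
for _ in range(1000):
    schema = cache.get("example_collection")  # simulates 1,000 inserts or queries
print(len(fetch_calls))
```

One thousand lookups trigger a single fetch, which is where the savings in network round trips and parsing come from.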

(3) More unified and complete interface functions

One of the most significant improvements in Milvus SDK v2 is the unification of interfaces: SDKs in every language now provide more unified and complete API methods, and the RESTful API in particular has been greatly enhanced.

Previously, because the HTTP interface started later than gRPC, it had functional gaps. The RESTful API has now closed them. For example, developers can complete almost all operations, including collection creation, partition management, index building, and data query, through the RESTful interface without switching to another interface.

This unified design makes the Milvus experience more consistent across scenarios, which lowers the learning cost for developers and noticeably improves usability.

If you want to get started with Milvus quickly, we recommend the easier-to-use RESTful API. If you have higher requirements for performance or need advanced features (such as iterators), we recommend a gRPC-based client, which offers better performance and richer feature support.
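As a rough illustration of the RESTful route, the sketch below builds (but does not send) two requests using only the Python standard library. The endpoint paths and JSON field names reflect our understanding of the RESTful v2 reference and should be checked against the reference for your version; BASE_URL and the build_request helper are assumptions made for this example.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:19530"  # assumption: a local Milvus instance


def build_request(path: str, payload: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request for a RESTful endpoint without sending it."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Paths and field names below are assumptions to verify against the reference.
create_req = build_request(
    "/v2/vectordb/collections/create",
    {"collectionName": "test_collection", "dimension": 128},
)
search_req = build_request(
    "/v2/vectordb/entities/search",
    {"collectionName": "test_collection", "data": [[0.0] * 128], "limit": 5},
)
print(create_req.full_url)
```

Against a running instance, each request could then be sent with urllib.request.urlopen(create_req).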

(4) SDK logic is aligned across languages for higher consistency

Milvus SDK v2 refactors and aligns the clients for each language so that their interface naming and calling conventions are highly consistent.

Now, whether you use Python, Java, Go, or NodeJS, each SDK has a MilvusClient main class that provides similar interface methods (see "code example changed from PyMilvus ORM to the MilvusClient SDK", milvus-io/milvus Discussion #33979 on GitHub).

The purpose of this change is to make all SDKs behave the same way and avoid the confusion caused by differences between languages in earlier versions. For example, some operations used to have different function names and parameter formats in different languages; these are now standardized. This alignment makes cross-language development smoother: once you are familiar with the Milvus SDK in one language, there is very little to relearn in another, and naming differences no longer get in the way.

(5) PyMilvus: from ORM to MilvusClient, with improved usability

The evolution of the Python SDK fully reflects the improvements in Milvus SDK v2.

The old version of PyMilvus uses the ORM module, which provides object-oriented classes such as Collection, Index, and Partition, along with standalone connection functions. This ORM mode mixes object-oriented and procedural styles: developers must define the schema object first and then instantiate the collection. The process is cumbersome, and the concepts are hard to grasp at first.

Previously, creating a collection in ORM mode meant manually defining a list of FieldSchema objects, constructing a CollectionSchema, and then calling the Collection class to create the collection. Now you can call MilvusClient.create_collection() to create a collection in one step.

This method accepts parameters such as dimension and metric_type directly to define the schema quickly, or a custom schema object.

More importantly, it accepts index parameters and automatically builds indexes for vector fields, and it can load the collection into memory as part of creation. In other words, a single call covers the three steps "create collection -> create index -> load collection" (if index parameters are provided, the collection is loaded automatically after creation, with no explicit call). This greatly shortens the setup process and makes Milvus convenient to use out of the box.

With these improvements, the upgraded PyMilvus milvusclient module has significant advantages over the old ORM in ease of use, consistency, and performance. Although the old ORM interface is still available, we will consider gradually retiring it in the future (see the ORM section for Python on the Zilliz Cloud Developer Hub) and switching fully to MilvusClient. We therefore strongly recommend upgrading as soon as possible to enjoy the convenience of the new SDK.

(6) Clearer and more complete documentation

We have fully optimized and restructured the product documentation, launched a more complete and clearer API Reference, and added multi-language sample code to the User Guide to help you get started quickly and understand Milvus features in depth. We also highly recommend the Ask AI assistant on the documentation site, which can introduce new features, explain the internal mechanisms of Milvus, and generate or modify sample code, making it easier and more enjoyable to consult the documentation and explore features.

(7) MCP server built on Milvus SDK

The MCP Server (GitHub link) built on Milvus SDK adopts the Model Context Protocol (MCP) to integrate LLM applications seamlessly with external data sources and tools.

As AI agents become more popular, it will be possible not only to generate code automatically from natural language, but also to design AI-oriented APIs that make calling and scheduling backend services more intelligent and automated. The MCP Server built on Milvus SDK was born in this context: it supports operating and managing Milvus clusters, and it provides a unified, open interface for automated operations, intelligent scheduling, and cross-system interaction.

This way, developers can manage Milvus clusters easily, and future AI agents can call these APIs directly to generate code and perform complex tasks, enabling seamless collaboration between humans and machines, and among machines.
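For readers new to MCP, the sketch below shows the general shape of a tool invocation: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with the tools/call method. The tool name milvus_vector_search and its arguments are hypothetical placeholders, not taken from the server's documentation; consult the MCP Server repository for the tools it actually exposes.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per request


def make_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 message asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
message = make_tool_call(
    "milvus_vector_search",
    {"collection_name": "example_collection", "vector": [0.1, 0.2, 0.3], "limit": 5},
)
wire = json.dumps(message)
print(wire)
```

An agent produces messages like this instead of SDK calls, so the same Milvus operations become available to any MCP-capable LLM application.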

03

Sample Code

The following short code snippets show how to use the new Python SDK v2 interface for collection creation and asynchronous operations. Compared with the ORM mode of v1, the code is more concise and unified.

(1) Use MilvusClient to define a schema and index, then create and load the collection:

    from pymilvus import MilvusClient, DataType

    # 1. Connect to Milvus (initializing the client establishes the connection)
    client = MilvusClient(uri="http://localhost:19530")

    # 2. Define the schema of the collection
    schema = MilvusClient.create_schema(auto_id=False, description="Schema of the sample collection")
    schema.add_field("id", DataType.INT64, is_primary=True)  # Primary key field
    schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=128)  # Vector field

    # 3. Prepare index parameters (optional; only needed to create an index at creation time)
    index_params = client.prepare_index_params()
    index_params.add_index(field_name="embedding", index_type="AUTOINDEX", metric_type="L2")

    # 4. Create a collection with an index and automatically load it into memory
    client.create_collection(
        collection_name="example_collection", schema=schema, index_params=index_params
    )
    print("Collection created and loaded with index!")

The above code defines the schema and index parameters, then creates, indexes, and loads the collection with a single create_collection call.

Because create_collection accepts index_params, there is no need to call create_index and load_collection separately. After the collection is created, the index is built automatically and the collection is loaded into memory. This is the approach MilvusClient advocates: one interface that completes all the operations needed to set up a collection.

In addition, MilvusClient supports a quick-setup mode that further improves ease of use: you only need to fill in the required parameters to create a collection with one line of code.

    client.create_collection(collection_name="test_collection", dimension=128)

(Comparison note: in the old ORM, you had to call Collection(name, schema) to create the collection object, then collection.create_index() to create the index, and finally collection.load() to load the collection; with MilvusClient it is done in one step.)

(2) Use AsyncMilvusClient to perform high-concurrency asynchronous operations:

    import asyncio
    import random

    from pymilvus import AsyncMilvusClient

    async def insert_vectors_concurrently():
        client = AsyncMilvusClient(uri="http://localhost:19530")
        # Placeholder data: 100,000 random 128-dimensional vectors
        vectors_to_insert = [[random.random() for _ in range(128)] for _ in range(100_000)]
        batch_size = 1000  # Recommended batch size
        tasks = []
        for i in range(0, len(vectors_to_insert), batch_size):
            batch_vectors = vectors_to_insert[i:i + batch_size]
            # Build one row (dict) per entity, keyed by field name
            data = [
                {"id": i + j, "embedding": vec}
                for j, vec in enumerate(batch_vectors)
            ]
            # Queue one asynchronous insert per batch
            tasks.append(client.insert("example_collection", data=data))
        # Run all batch inserts concurrently
        insert_results = await asyncio.gather(*tasks)
        await client.close()
        return insert_results

    # Execute the asynchronous task
    asyncio.run(insert_vectors_concurrently())

The above code uses an AsyncMilvusClient instance with async/await syntax to run insert operations concurrently. We create multiple insert tasks and schedule them together with asyncio.gather, making full use of the concurrent processing capacity of the Milvus backend. Compared with synchronous insertion, asynchronous concurrent insertion can greatly improve throughput. Python SDK v1 had no such native async support, whereas Milvus SDK v2 takes full advantage of Python's asynchronous features.
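MilvusClient.insert and its asynchronous counterpart take row-oriented data, that is, a list of dicts keyed by field name. The helper below is a small sketch of how vectors could be split into such row batches before being handed to insert. iter_row_batches is a hypothetical helper name, and the field names id and embedding follow the example collection defined earlier.

```python
from typing import Iterator, List


def iter_row_batches(
    vectors: List[List[float]], batch_size: int, start_id: int = 0
) -> Iterator[List[dict]]:
    """Yield batches of row dicts, assigning consecutive integer ids."""
    for offset in range(0, len(vectors), batch_size):
        chunk = vectors[offset:offset + batch_size]
        yield [
            {"id": start_id + offset + j, "embedding": vec}
            for j, vec in enumerate(chunk)
        ]


# Tiny 2-dimensional vectors, used only to keep the output readable.
sample_vectors = [[float(i), float(i) + 0.5] for i in range(5)]
batches = list(iter_row_batches(sample_vectors, batch_size=2))
print([len(batch) for batch in batches])
```

Each yielded batch can be passed as the data argument of one insert call, so the number of batches equals the number of concurrent tasks.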

Similarly, you can use the asynchronous client to run queries or searches concurrently. For example, if you replace the insert call in the code above with client.search("example_collection", data=[query_vec], limit=5), you can issue multiple search requests at the same time. The asynchronous interface of Milvus SDK v2 executes each request without blocking, which makes the most of both client and server resources.

04

Summary

Milvus SDK v2 brings significant improvements over v1: better performance, more unified interfaces, more consistent cross-language support, and greater ease of use.

With MilvusClient unified across languages and the new API, developers get a consistent development experience in any language. With native asynchronous support, Milvus performs better in high-concurrency scenarios. The new design, with the Python SDK as the example, addresses many shortcomings of the old ORM model. At the same time, the new documentation and the intuitive, concise interface design let developers get started quickly and build applications efficiently. We firmly believe that as Milvus continues to improve across language platforms, it will further advance unstructured data processing and AI integration.

We strongly recommend that users still on SDK v1 upgrade to Milvus SDK v2 as soon as possible. Support for v1 is scheduled to end in Milvus 3.0. Upgrading is not difficult: the Milvus team maintains backward compatibility throughout 2.x, and the old v1/ORM interface will remain available for some time. You can follow the new documentation to update your code while phasing out the old interface step by step. The official documentation and community resources provide detailed guides and examples to help you migrate smoothly. Along with the convenience of Milvus SDK v2, you will also get more active community support and new features in subsequent versions.

For more details, please see the latest official API Reference. Our User Guides are also being updated and improved.