The Three Deployment Modes of Milvus

Written by
Clara Bennett
Updated on: June 26, 2025
Recommendation

A complete analysis of Milvus vector database deployment, covering all requirements from lightweight to large-scale production environments.

Core content:
1. Three deployment modes: the characteristics and applicable scenarios of Milvus Lite, Standalone, and Distributed
2. How to choose the appropriate Milvus deployment mode according to the project stage and scale
3. Integration method of Milvus Lite and Standalone and data migration strategy

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

The Milvus vector database has three deployment modes, Milvus Lite, Milvus Standalone, and Milvus Distributed, which cover different usage scenarios and data scales. All three modes expose the same client-side API.

1. Brief description

Milvus Lite

Milvus Lite is a Python library that can be imported into existing applications. As a lightweight version of Milvus, it is well suited to testing and development in Jupyter notebooks or on smart devices with limited resources.

To integrate Milvus Lite into your application, run pip install pymilvus to install it, then call MilvusClient("./demo.db") to instantiate a vector database backed by a local file that persists all data.
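The flow described above can be sketched roughly as follows, assuming pymilvus is installed with Milvus Lite support. The collection name demo_collection, the 4-dimensional toy vectors, and the helper names build_demo_rows and run_lite_demo are illustrative assumptions, not taken from the original article.

```python
import random


def build_demo_rows(num_rows=3, dim=4, seed=0):
    """Build toy rows in the shape MilvusClient.insert() accepts."""
    rng = random.Random(seed)
    return [
        {"id": i, "vector": [rng.random() for _ in range(dim)], "text": f"doc {i}"}
        for i in range(num_rows)
    ]


def run_lite_demo(db_path="./demo.db", dim=4):
    """Create a collection in a local Milvus Lite file, insert rows, and search."""
    # Imported lazily so build_demo_rows() works even without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(db_path)  # a local file path selects Milvus Lite
    name = "demo_collection"  # hypothetical collection name
    if client.has_collection(name):
        client.drop_collection(name)
    client.create_collection(collection_name=name, dimension=dim)

    rows = build_demo_rows(dim=dim)
    client.insert(collection_name=name, data=rows)

    # Use the first row's vector as the query and return the two nearest hits.
    return client.search(
        collection_name=name,
        data=[rows[0]["vector"]],
        limit=2,
        output_fields=["text"],
    )
```

Calling run_lite_demo() writes its data to ./demo.db, so the data is still there the next time a client is opened on the same file.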

Milvus Standalone

Milvus Standalone is a single-machine server deployment. All of its components are packaged into one Docker image, which makes it easy to deploy. If you have production workloads but do not want to use Kubernetes, running Milvus Standalone on a single machine with sufficient memory is a good choice. Milvus Standalone also supports high availability through master-slave replication.

Milvus Distributed

Milvus Distributed is deployed on a Kubernetes cluster. It uses a cloud-native architecture in which ingestion load and search queries are handled by separate nodes, allowing redundancy for key components. It offers the highest scalability and availability of the three modes, and the resources allocated to each component can be customized. Milvus Distributed is the first choice for enterprise users running large-scale vector search systems in production.

2. How to choose the deployment method that suits you

The Milvus deployment mode can be selected based on the project stage and scale.

Project Phases

Choose based on the development stage of your application:

1. For rapid prototyping

If you need to build a prototype quickly or are learning, for example with Retrieval-Augmented Generation (RAG) demos, AI chatbots, or multimodal search, Milvus Lite on its own or a combination of Milvus Lite and Milvus Standalone is a good choice. You can use Milvus Lite in a notebook for rapid prototyping and to explore different approaches, such as different chunking strategies in RAG. When you want to deploy an application built with Milvus Lite in small-scale production to serve real users, or to validate an idea on a larger dataset (for example, more than a few million vectors), Milvus Standalone is the suitable choice. The application logic written for Milvus Lite can be reused, because all Milvus deployments share the same client-side API. Data stored in Milvus Lite can also be ported to Milvus Standalone with a command-line tool.
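Because the client API is shared, moving from Milvus Lite to a server deployment can come down to changing the connection string. The sketch below illustrates the idea; the helper names milvus_uri and connect are hypothetical, and http://localhost:19530 is only an assumed address for a locally running Milvus Standalone.

```python
def milvus_uri(mode, lite_path="./demo.db", server_url="http://localhost:19530"):
    """Return the connection string for a given deployment mode.

    Milvus Lite is addressed by a local file path; Standalone and
    Distributed are addressed by a server URL. Both defaults are assumptions.
    """
    mode = mode.lower()
    if mode == "lite":
        return lite_path
    if mode in ("standalone", "distributed"):
        return server_url
    raise ValueError(f"unknown deployment mode: {mode!r}")


def connect(mode):
    """Open a client for the chosen mode; the rest of the app stays the same."""
    from pymilvus import MilvusClient  # imported lazily; requires pymilvus

    return MilvusClient(uri=milvus_uri(mode))
```

With this layout, the collection, insert, and search calls in the application do not need to know which deployment mode is behind the client.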

2. Small-scale production deployment

In the early production stage, when the project is still seeking product-market fit and agility matters more than scalability, Milvus Standalone is the best choice. Given enough machine resources it can still scale to 100 million vectors, and it demands far less DevOps effort than maintaining a K8s cluster.

3. Large-scale production deployment

When your business grows rapidly and the data scale exceeds the capacity of a single server, it is time to consider Milvus Distributed. You can keep using Milvus Standalone as a development or staging environment while operating a K8s cluster that runs Milvus Distributed. This setup can take you to tens of billions of vectors and lets you tailor node sizes to your specific workload, such as read-heavy with infrequent writes or write-heavy with infrequent reads.

4. Local Search on Edge Devices

For searching through private or sensitive information on edge devices, Milvus Lite can be deployed on the device without relying on cloud-based services for text or image search. This is suitable for cases such as proprietary document search or on-device object detection.

Project dataset size

Choose a deployment method that suits you based on the size of your project dataset.

  • Milvus Lite is recommended for smaller datasets, up to a few million vectors.
  • Milvus Standalone is suitable for medium-sized datasets and can be expanded to 100 million vectors.
  • Milvus Distributed is designed for large-scale deployment and can handle datasets ranging from 100 million to tens of billions of vectors.
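These size guidelines can be expressed as a small helper. This is only a sketch: the function name recommend_deployment is hypothetical, and the 5 million cutoff is an assumed stand-in for "a few million", which the guideline does not pin down.

```python
LITE_MAX_VECTORS = 5_000_000          # assumption: stand-in for "a few million"
STANDALONE_MAX_VECTORS = 100_000_000  # from the guideline: up to 100 million


def recommend_deployment(num_vectors):
    """Map an expected vector count to a Milvus deployment mode."""
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors <= LITE_MAX_VECTORS:
        return "Milvus Lite"
    if num_vectors <= STANDALONE_MAX_VECTORS:
        return "Milvus Standalone"
    return "Milvus Distributed"


for count in (200_000, 30_000_000, 2_000_000_000):
    print(f"{count:>13,} vectors -> {recommend_deployment(count)}")
```

Dataset size is only one input; the project stage and operational constraints discussed earlier should be weighed alongside it.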

3. Functional comparison

The three deployment modes offer the same data types and search functions. Milvus Standalone and Milvus Distributed additionally provide more client SDKs, more data management features, and more consistency levels.

| Function | Milvus Lite | Milvus Standalone | Milvus Distributed |
| --- | --- | --- | --- |
| SDK / Client Software | Python, gRPC | Python, Go, Java, Node.js, C#, RESTful | Python, Java, Go, Node.js, C#, RESTful |
| Data Types | Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON | Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON | Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON |
| Search Functions | Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search | Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search | Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search |
| CRUD Operations | ✔️ | ✔️ | ✔️ |
| Advanced Data Management | N/A | Access Control, Partition, Partition Key | Access Control, Partition, Partition Key, Physical Resource Grouping |
| Consistency Level | Strong | Strong, Bounded Staleness, Session, Eventual | Strong, Bounded Staleness, Session, Eventual |