Overview of the three Milvus deployment modes

Written by

Clara Bennett

Updated on: June 26, 2025

The Milvus vector database offers three deployment modes, Milvus Lite, Milvus Standalone, and Milvus Distributed, to support different usage scenarios and data scales. All three modes provide the same client application interface.
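Because the client interface is shared, switching modes usually comes down to changing the connection target. The following is a minimal sketch, assuming the pymilvus package is installed. The helper names, the local file path, and the distributed endpoint are hypothetical placeholders; 19530 is the default Milvus server port.

```python
# Hypothetical connection targets, one per deployment mode.
CONNECTION_TARGETS = {
    "lite": "./demo.db",                     # local file used by Milvus Lite
    "standalone": "http://localhost:19530",  # default Milvus server port
    "distributed": "http://milvus.example.internal:19530",  # placeholder endpoint
}


def connection_target(mode: str) -> str:
    """Return the connection target for a deployment mode name."""
    try:
        return CONNECTION_TARGETS[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown deployment mode: {mode!r}") from None


def make_client(mode: str):
    """Create a MilvusClient for the given mode (requires pymilvus)."""
    # Imported lazily so the helpers above work without pymilvus installed.
    from pymilvus import MilvusClient

    return MilvusClient(uri=connection_target(mode))
```

With this layout, application code that calls make_client("lite") during prototyping only needs a different mode name when it moves to a server deployment.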

1. Brief description

Milvus Lite

Milvus Lite is a Python library that can be imported into existing applications. As a lightweight version of Milvus, it is well suited to testing and development in Jupyter notebooks or on smart devices with limited resources.

To integrate Milvus Lite into your application, run pip install pymilvus to install it, then use the statement MilvusClient("./demo.db") to instantiate a vector database backed by a local file that persists all data.
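The steps above can be sketched end to end as follows. This is an illustrative example rather than an official quickstart: the collection name, the vector dimension, the random data, and the helper names make_rows and run_demo are all assumptions made for the demo.

```python
import random


def make_rows(count: int, dim: int, seed: int = 0):
    """Build `count` rows of random `dim`-dimensional vectors with integer ids."""
    rng = random.Random(seed)
    return [
        {"id": i, "vector": [rng.random() for _ in range(dim)]}
        for i in range(count)
    ]


def run_demo(db_path: str = "./demo.db", dim: int = 8):
    """Create a collection in a local Milvus Lite file, insert rows, and search."""
    from pymilvus import MilvusClient  # installed via: pip install pymilvus

    client = MilvusClient(db_path)  # all data is persisted in this local file

    # Start from a clean collection so the demo can be re-run safely.
    if client.has_collection(collection_name="demo_collection"):
        client.drop_collection(collection_name="demo_collection")
    client.create_collection(collection_name="demo_collection", dimension=dim)

    rows = make_rows(count=10, dim=dim)
    client.insert(collection_name="demo_collection", data=rows)

    # Use the first stored vector as the query and return the 3 nearest rows.
    return client.search(
        collection_name="demo_collection",
        data=[rows[0]["vector"]],
        limit=3,
    )
```

Calling run_demo() in a notebook creates demo.db in the working directory and returns the search hits for the query vector.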

Milvus Standalone

Milvus Standalone is a single-machine server deployment. All of its components are packaged into a single Docker image, which makes it convenient to deploy. If you have production workloads but do not want to use Kubernetes, running Milvus Standalone on a single machine with sufficient memory is a good choice. Milvus Standalone also supports high availability through master-slave replication.

Milvus Distributed

Milvus Distributed is deployed on a Kubernetes cluster. It uses a cloud-native architecture in which ingestion load and search queries are handled by separate nodes, allowing redundancy for critical components. It offers the highest scalability and availability of the three modes and lets you customize the resources allocated to each component. Milvus Distributed is the first choice for enterprise users running large-scale vector search systems in production.

2. How to choose the deployment method that suits you

The Milvus deployment mode can be selected based on the project stage and scale.

Project stage

Choose based on the development stage of your application:

1. For rapid prototyping

If you want to build a prototype quickly or are learning, for example with Retrieval-Augmented Generation (RAG) demos, AI chatbots, or multimodal search, Milvus Lite on its own or a combination of Milvus Lite and Milvus Standalone is a good choice. You can use Milvus Lite in a notebook to prototype rapidly and explore different approaches, such as alternative chunking strategies in RAG. If you then want to deploy an application built with Milvus Lite to small-scale production, serve real users, or validate ideas on larger datasets (e.g., more than a few million vectors), Milvus Standalone is a suitable choice. The application logic written for Milvus Lite can be reused because all Milvus deployments share the same client application interface, and data stored in Milvus Lite can be ported to Milvus Standalone with command-line tools.

2. Small-scale production deployment

In the early production stage, when the project is still seeking product-market fit and agility matters more than scalability, Milvus Standalone is the best choice. Given enough machine resources, it can still scale to 100 million vectors, and it requires far less DevOps effort than maintaining a K8s cluster.

3. Large-scale production deployment

When your business grows rapidly and the data scale exceeds the capacity of a single server, it is time to consider Milvus Distributed. You can keep using Milvus Standalone as a development or staging environment while operating a K8s cluster that runs Milvus Distributed. This setup can handle tens of billions of vectors and lets you size nodes for your specific workload, such as read-heavy with few writes or write-heavy with few reads.

4. Local search on edge devices

For searching private or sensitive information on edge devices, Milvus Lite can be deployed on the device itself, so text or image search does not rely on cloud-based services. This suits use cases such as proprietary document search or on-device object detection.

Project dataset size

Choose a deployment mode based on the size of your project dataset.

Milvus Lite is recommended for smaller datasets, up to a few million vectors.
Milvus Standalone is suitable for medium-sized datasets and scales up to 100 million vectors.
Milvus Distributed is designed for large-scale deployment and can handle datasets ranging from 100 million to tens of billions of vectors.
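The size guideline above can be expressed as a small helper. This is only a sketch: the 100 million boundary comes from the list, while the 5 million cutoff is an assumption standing in for "a few million", and the function name is hypothetical.

```python
LITE_MAX_VECTORS = 5_000_000          # assumption: stands in for "a few million"
STANDALONE_MAX_VECTORS = 100_000_000  # boundary stated in the guideline above


def recommend_mode(num_vectors: int) -> str:
    """Suggest a Milvus deployment mode from an expected vector count."""
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors <= LITE_MAX_VECTORS:
        return "Milvus Lite"
    if num_vectors <= STANDALONE_MAX_VECTORS:
        return "Milvus Standalone"
    return "Milvus Distributed"
```

Dataset size is only one input. The project stage discussed earlier can justify Milvus Standalone for a small dataset that already serves real users.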

3. Functional comparison

The three deployment modes are consistent in data types and search functions. Milvus Standalone and Milvus Distributed additionally provide broader client support and more data management features.

Function	Milvus Lite	Milvus Standalone	Milvus Distributed
SDK / Client Software	Python, gRPC	Python, Go, Java, Node.js, C#, RESTful	Python, Java, Go, Node.js, C#, RESTful
Data Types	Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON	Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON	Dense Vector, Sparse Vector, Binary Vector, Boolean, Integer, Floating Point, VarChar, Array, JSON
Search Function	Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search	Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search	Vector search (ANN search), Metadata filtering, Range search, Scalar query, Get entity by primary key, Hybrid search
CRUD Operations	✔️	✔️	✔️
Advanced Data Management	N/A	Access Control, Partition, Partition Key	Access Control, Partition, Partition Key, Physical Resource Grouping
Consistency Level	Strong	Strong, Bounded Staleness, Session, Eventual	Strong, Bounded Staleness, Session, Eventual
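
The consistency levels in the last row are typically chosen per collection or per request. The sketch below assumes the pymilvus client, where, to my understanding, the levels are passed as the strings "Strong", "Bounded", "Session", and "Eventually". The helper names are hypothetical, and the search call needs a client connected to a running Milvus Standalone or Milvus Distributed server.

```python
# Consistency level name -> string value assumed to be accepted by pymilvus.
CONSISTENCY_LEVELS = {
    "strong": "Strong",
    "bounded staleness": "Bounded",
    "session": "Session",
    "eventual": "Eventually",
}


def to_client_level(name: str) -> str:
    """Translate a consistency level name into the assumed client string."""
    key = name.strip().lower()
    if key not in CONSISTENCY_LEVELS:
        raise ValueError(f"unknown consistency level: {name!r}")
    return CONSISTENCY_LEVELS[key]


def search_with_level(client, collection_name, query_vector, level_name, limit=3):
    """Run a vector search with an explicit consistency level.

    `client` is expected to be a pymilvus MilvusClient connected to a server.
    """
    return client.search(
        collection_name=collection_name,
        data=[query_vector],
        limit=limit,
        consistency_level=to_client_level(level_name),
    )
```

Stronger levels favor reading the latest writes, while weaker levels trade freshness for lower search latency.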