Solving the Dify + Milvus integration problem: a practical guide to avoiding pitfalls from zero to one

Master the skills of integrating Dify with Milvus to strengthen large-scale data processing.
Core content:
1. Deploying stand-alone Milvus on WSL Linux
2. Modifying the Milvus configuration and verifying service startup
3. Deploying Dify on WSL Linux and configuring the basic environment
1. Guide to deploying stand-alone Milvus on WSL Linux
1. Environment preparation and hardware verification
Hardware requirements (per Milvus's official prerequisites, the CPU must support at least one of the following instruction sets): SSE4.2, AVX, AVX2, AVX-512. A quick check follows below.
Software dependencies:
Docker 19.03+ and Docker Compose 1.25.1+
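A minimal pre-flight check, assuming a standard WSL Ubuntu/Debian environment with the Docker CLI already installed:

# List which SIMD instruction sets the CPU exposes (at least one is needed)
lscpu | grep -oE 'sse4_2|avx512[a-z_]*|avx2|avx' | sort -u
# Confirm the Docker and Compose versions meet the minimums
docker --version          # expect 19.03 or newer
docker compose version    # or: docker-compose --version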
2. Download the Milvus docker-compose.yml file
# 1. Download the official stand-alone deployment file
$ wget https://github.com/milvus-io/milvus/releases/download/v2.5.6/milvus-standalone-docker-compose.yml -O docker-compose.yml
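To confirm the download succeeded, a quick inspection of the file (nothing Milvus-specific, just a sanity check):

# Verify the file arrived and list the images it will pull
ls -lh docker-compose.yml
grep 'image:' docker-compose.yml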
3. Modify the docker-compose.yml configuration
services:
  etcd:
    restart: always    # ensure the container comes back up automatically after Docker restarts
    ....
  minio:
    restart: always    # ensure the container comes back up automatically after Docker restarts
    ports:
      - "19001:9001"   # remapped so MinIO won't conflict on ports if RAGFlow is installed later
      - "19000:9000"
    ....
  standalone:
    restart: always
    ....
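Before starting anything, Compose itself can validate the edited file; the --quiet flag limits output to errors:

# Parses docker-compose.yml and reports any syntax or indentation mistakes
docker-compose config --quiet && echo "docker-compose.yml OK"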
4. Modify milvus.yaml in the container
# The containers must already be running (docker-compose up -d) before you can exec into them.
# Enter the Milvus container:
docker exec -it milvus-standalone /bin/bash
# Inside the container: enable authentication, then leave the shell
sed -i 's/authorizationEnabled: false/authorizationEnabled: true/g' /milvus/configs/milvus.yaml
exit
# Back on the host: confirm the change took effect (should print: authorizationEnabled: true)
docker exec -it milvus-standalone grep authorizationEnabled /milvus/configs/milvus.yaml
# Restart the container so the new setting takes effect
docker restart milvus-standalone
5. Start the service and verify the service status
docker-compose up -d
Check that the connection works with Attu, Milvus's visual admin interface (a Windows desktop installer is available):
https://github.com/zilliztech/attu.git
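Beyond Attu, the deployment can be verified from the command line. This assumes the official compose file's default mapping of the monitoring port 9091; the healthz endpoint is standard Milvus behavior, but adjust the port if your mapping differs:

# All three containers (etcd, minio, standalone) should show as Up
docker-compose ps
# The health endpoint on the monitoring port should answer OK
curl http://127.0.0.1:9091/healthz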
2. Guide to deploying Dify on WSL Linux
1. Basic environment configuration
# Step 1. Clone the repository (users in mainland China may want to use a mirror source)
git clone https://github.com/langgenius/dify.git
# Step 2. Configure the env environment variable
cd dify/docker
cp .env.example .env
sudo vim .env
---------------------------------------------
# The type of vector store to use.
# VECTOR_STORE=weaviate  # comment out the default vector store configuration
VECTOR_STORE=milvus
# The Milvus URI. Dify's containers cannot reach a host-published port via 127.0.0.1,
# so use the Docker network gateway address instead (see the check after this block).
MILVUS_URI=http://172.18.0.1:19530
MILVUS_TOKEN=
# With authentication freshly enabled, Milvus's initial account is root / Milvus;
# replace the placeholders below with your actual credentials.
MILVUS_USER=your_user
MILVUS_PASSWORD=your_pass
MILVUS_ENABLE_HYBRID_SEARCH=True
--------------------------------------------
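The 172.18.0.1 address above is only an example; the correct value is the gateway of the Docker network as seen from Dify's containers. A quick way to find and test it (standard Docker tooling; the network name varies per setup, and nc can be swapped for telnet if netcat is not installed):

# Gateway of the default bridge network (often 172.17.0.1; compose networks often use 172.18.x.x)
docker network inspect bridge -f '{{(index .IPAM.Config 0).Gateway}}'
docker network ls    # find the network the Milvus stack runs on, then inspect it the same way
# Confirm Milvus answers on 19530 at that address
nc -zv 172.18.0.1 19530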
Step 3: Modify the docker-compose.yaml configuration
# Comment out Dify's bundled Milvus services to avoid pulling duplicate images
# and conflicting with the Milvus instance installed earlier:

#  Milvus vector database services
#  etcd:
#    container_name: milvus-etcd
#    ....
#  minio:
#    container_name: milvus-minio
#    ....
#  milvus-standalone:
#    container_name: milvus-standalone
#    ....
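A simple grep confirms nothing Milvus-related will be started by Dify's compose file:

# Every line defining a Milvus service should now start with '#'
grep -n 'milvus' docker-compose.yaml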
2. Startup and Integration
docker-compose up -d
# This pulls and starts Dify's services: nginx, api, worker, web, redis, postgres (db), sandbox, etc.
If all containers come up as running/healthy, the initial Milvus + Dify configuration has succeeded.
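A command-line spot check, assuming Dify's web entry is on the default port 80 (substitute your NGINX_PORT if you changed it in .env):

# All Dify containers should be Up / healthy
docker-compose ps
# Dify's web entry point should answer through nginx
curl -sI http://127.0.0.1:80 | head -n 1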
3. Start Dify → Create a knowledge base
In Attu you should now see the corresponding collection generated, which confirms that the Milvus + Dify deployment and integration has succeeded. A command-line alternative follows below.
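If you prefer the command line to Attu, Milvus also serves a REST API on port 19530. A sketch using the v2 endpoint, assuming a recent 2.x release and that the default root credentials from enabling authentication are still in place:

# List all collections; Dify's knowledge-base collection should appear in the result
curl -s -X POST http://127.0.0.1:19530/v2/vectordb/collections/list \
  -H 'Authorization: Bearer root:Milvus' \
  -H 'Content-Type: application/json' \
  -d '{}'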
Tips to avoid pitfalls:
Port conflict: if port 8080 is occupied, change NGINX_PORT and EXPOSE_NGINX_PORT in .env.
Vector store connection failed: check that Milvus port 19530 is reachable (telnet 127.0.0.1 19530).
GPU support: for GPU acceleration, install the NVIDIA Container Toolkit and add a deploy.resources.reservations.devices section to docker-compose.yml; a verification sketch follows below.
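A minimal way to confirm the NVIDIA Container Toolkit works before touching the compose file; this follows NVIDIA's standard smoke test, and the CUDA image tag is only an example:

# If the toolkit is set up correctly, this prints the host's GPU table
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi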
3. Typical Problem Solution Library
| Problem | Troubleshooting | Solution |
| --- | --- | --- |
| Milvus fails to start | 1. Check docker logs milvus-standalone 2. Verify CPU instruction-set support 3. Inspect /var/lib/milvus/logs | |
| Dify reports storage permission errors | Check permissions on Dify's storage directory | chmod -R 777 ./storage |
| Retrieval results are not relevant | Review the knowledge-base retrieval settings | Adjust similarity_score_threshold into the 0.75-0.85 range |
| Containers crash or memory keeps climbing | 1. Watch docker stats 2. Analyze the OOM Killer log 3. Check for thread deadlock | Configure memory limits in docker-compose.yml |
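The memory-related row above can be worked through with a few standard commands (a sketch; OOM Killer messages live in the kernel log on most distributions):

# Live per-container CPU/memory usage, printed once
docker stats --no-stream
# Look for OOM Killer events in the kernel log
dmesg | grep -iE 'oom|killed process'
# Tail Milvus logs for errors
docker logs --tail 100 milvus-standalone 2>&1 | grep -i error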
4. Performance Optimization Suggestions
Caching strategy: configure a Redis L2 cache for high-frequency queries.
Batch processing: for large batches of documents, enable batch_size=500 to reduce IO overhead.
Hardware acceleration: use GPUs with Tensor Core support (such as T4/A10) to run the BGE-M3 vector model.
Cluster deployment: when the data volume exceeds 100 million, a Milvus distributed cluster is recommended (requires a Kubernetes environment).