What is the relationship between RAG and vector databases?

Written by

Jasper Cole

Updated on: June 18, 2025

What is RAG?

RAG stands for Retrieval-Augmented Generation, an application architecture for large language models that combines "retrieval" and "generation":

Retrieval: content related to the user's question is retrieved from a document repository.
Augmented Generation: the retrieved content is fed into a large language model (such as GPT) together with the user's question, so that the model can generate a more accurate and context-rich answer.

What is a vector database?

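Vector databases (such as Milvus, Pinecone, and Weaviate) are databases designed specifically to store and efficiently search high-dimensional vector representations. FAISS is often mentioned alongside them, although it is a similarity-search library rather than a full database. The workflow can be understood as: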

Text (or image, audio) → passed through an embedding model → converted into a vector
Vector → stored in the vector database
User question → also converted into a vector → used for similarity search in the vector database (usually by cosine similarity or Euclidean distance)
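
The two similarity measures mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the three-dimensional vectors are invented for the example, whereas real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical embeddings, invented for illustration only
question = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # same direction as the question, larger magnitude
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the question

print("cosine, close doc:", cosine_similarity(question, doc_close))
print("cosine, far doc:  ", cosine_similarity(question, doc_far))
print("euclidean, close: ", euclidean_distance(question, doc_close))
```

Note that cosine similarity ignores vector length while Euclidean distance does not: doc_close points in the same direction as the question, so its cosine similarity is essentially 1, yet its Euclidean distance from the question is not zero.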

Relationship between RAG and vector databases

The "R" (retrieval) part of RAG is often implemented with a vector database.

The specific process is as follows:

Knowledge preprocessing:

The document is split into chunks (chunking), and each chunk of text is converted into a vector (embedding);
The vectors are stored in a vector database.
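
The preprocessing step might look roughly like the sketch below. Everything in it is an assumption made for illustration: chunk_text splits on whitespace with a fixed word overlap (one of many possible chunking strategies), toy_embed is a hashed bag-of-words stand-in for a real embedding model, and a plain Python list plays the role of the vector database.

```python
import hashlib


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into chunks of chunk_size words; neighbours share overlap words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(text, dim=16):
    """Hypothetical stand-in for an embedding model: hashed word counts."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


# Invented sample text standing in for a real document
document = " ".join(f"word{i}" for i in range(100))

# The "vector database" here is just a list of (chunk, vector) pairs
index = [(chunk, toy_embed(chunk)) for chunk in chunk_text(document)]
print(len(index), "chunks indexed")
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks. Suitable chunk sizes depend on the embedding model and the documents, so the defaults here are placeholders.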

When a user asks a question:

The question is also converted into a vector;
The vector database performs a similarity search and returns the most relevant document fragments.
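
The query step can be sketched with a brute-force, in-memory store. InMemoryVectorStore is a hypothetical class written for this example and is not the API of any real product. The vectors are hand-written stand-ins for embeddings. Vectors are normalized when they are added, so the inner product computed in search equals cosine similarity.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class InMemoryVectorStore:
    """Brute-force stand-in for a vector database (illustration only)."""

    def __init__(self):
        self._items = []  # list of (unit_vector, text)

    def add(self, vector, text):
        self._items.append((normalize(vector), text))

    def search(self, query_vector, k=2):
        """Return the k best (score, text) pairs, highest score first."""
        q = normalize(query_vector)
        scored = (
            (sum(a * b for a, b in zip(q, vec)), text)
            for vec, text in self._items
        )
        return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Hand-written vectors and texts, invented for illustration
store = InMemoryVectorStore()
store.add([0.9, 0.1, 0.0], "Sales targets are set per fiscal year.")
store.add([0.0, 0.2, 0.9], "VPN setup guide for new laptops.")
store.add([0.7, 0.6, 0.1], "Quarterly financial report summary.")

results = store.search([1.0, 0.2, 0.0], k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Scanning every stored vector is fine for a sketch but scales linearly with the collection size. Vector databases typically rely on approximate nearest-neighbor indexes to avoid a full scan on large collections.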

Retrieval-augmented generation:

The retrieved fragments and the user's question are fed into a large language model to generate the answer.
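
One way to combine the search results with the question is to assemble them into a single prompt, as in the sketch below. The prompt wording, the build_prompt and answer helpers, and the fake_llm stand-in are all assumptions made for this example. In practice, generate would wrap whatever LLM client the application actually uses.

```python
def build_prompt(question, retrieved_chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(question, retrieved_chunks, generate):
    """generate is any callable that maps a prompt string to a completion string."""
    return generate(build_prompt(question, retrieved_chunks))


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call; returns a canned string."""
    return f"(model output for a prompt of {len(prompt)} characters)"


# Placeholder fragments, invented for illustration
chunks = [
    "Placeholder paragraph from the 2023 sales plan.",
    "Placeholder paragraph from the annual financial report.",
]

prompt = build_prompt("What is our sales target for 2023?", chunks)
print(prompt)
print(answer("What is our sales target for 2023?", chunks, fake_llm))
```

Numbering the fragments makes it possible to ask the model to cite which fragment supports each statement, which is a common design choice rather than a requirement of RAG.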

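Diagram (simplified version)

Documents → chunking → embedding model → vector database
User question → embedding model → similarity search in the vector database → relevant document fragments
Relevant document fragments + user question → large language model → answer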

Example

Suppose you build an internal document question-and-answer system:

You use a vector database (such as Milvus) to store embeddings of all employee manuals, financial reports, and technical documents;
A user asks: "What is our sales target for 2023?"
The system embeds the question into a vector and then finds similar document paragraphs in the vector database;
A large language model (such as GPT-4) then generates an answer from the question and the retrieved paragraphs.

Summary

Item	RAG	Vector Database
Role	Improves generation accuracy by supplying context through retrieval	Stores and searches embeddings (document snippets, questions, etc.)
Core feature	Combines retrieval with generation	Quickly finds similar content
Relationship	Its retrieval part is commonly implemented with a vector database	Supplies the similar content that RAG retrieves