What is the relationship between RAG and vector databases?

Written by
Jasper Cole
Updated on: June 18, 2025
Recommendation

Explore the key roles that RAG and vector databases play in large language model applications.

Core content:
1. Definition of RAG and its components
2. Functions and application scenarios of vector databases
3. How RAG and vector databases work together to improve generation accuracy

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

What is RAG


RAG stands for Retrieval-Augmented Generation, a large language model application architecture that combines "retrieval" and "generation":

  1. Retrieval: Retrieve content related to the user's question from a document repository.

  2. Augmented generation: Feed the retrieved content, together with the user's question, into a large language model (such as GPT) so that it can generate a more accurate, context-rich answer.
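
The "augmented generation" step can be sketched as plain prompt assembly. This is a minimal illustration, not a definitive implementation: the function name `build_prompt`, the prompt wording, and the placeholder passages are all assumptions made for the example, and the call to an actual language model is left out.

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved text and the user question into one prompt string."""
    # Number each chunk so the individual passages stay distinguishable.
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder passages standing in for real retrieval results (hypothetical).
chunks = ["Passage A about the topic.", "Passage B about the topic."]
prompt = build_prompt("What does passage A cover?", chunks)
print(prompt)
```

The resulting string is what would be sent to the model; the retrieval step only decides which passages end up in `chunks`.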



What is a vector database?


Vector databases (such as Milvus, Pinecone, and Weaviate) and vector search libraries (such as FAISS) are designed specifically to store and efficiently search high-dimensional vector representations. You can think of the workflow as:

  • Text (or an image, or audio) → passed through an embedding model → converted into a vector

  • Vector → stored in the vector database

  • User question → also converted into a vector → used for similarity search in the vector database (usually by cosine similarity or Euclidean distance)
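
The two similarity measures mentioned above can be written out in a few lines. The sketch below uses only the Python standard library and tiny hand-written vectors; the vectors and the labels `same_direction` and `orthogonal` are made up for illustration, and a real vector database would use optimized index structures rather than these loops.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b; 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
stored = {"same_direction": [2.0, 0.0], "orthogonal": [0.0, 3.0]}
for name, vec in stored.items():
    print(name, cosine_similarity(query, vec), euclidean_distance(query, vec))
```

Cosine similarity ignores vector length (the `same_direction` vector scores 1.0 even though it is twice as long as the query), while Euclidean distance does not, which is one reason the choice of metric matters when configuring a search.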



Relationship between RAG and vector databases


The "R" (retrieval) part of RAG is often implemented with a vector database.

The specific process is as follows:

  1. Knowledge preprocessing:

    • The documents are split into chunks (chunking), and each chunk of text is converted into a vector (embedding);

    • The vectors are stored in a vector database.

  2. When a user asks a question:

    • The question is also converted into a vector;

    • The vector database runs a similarity search and returns the relevant document fragments.

  3. Retrieval-augmented generation:

    • The retrieved fragments and the user's question are fed into a large language model to generate the answer.
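
The whole process can be sketched end to end in plain Python. Everything here is a simplified stand-in, labeled as such: `toy_embed` uses word counts instead of a real embedding model, `InMemoryVectorStore` is a hypothetical brute-force class rather than any real database client, and the sample document text is invented for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=8):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text):
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical brute-force store; real databases use approximate indexes."""

    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunk):
        self.items.append((chunk, toy_embed(chunk)))

    def search(self, question, k=1):
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


# Step 1: knowledge preprocessing (chunk, embed, store).
document = (
    "Vector databases store embeddings for similarity search. "
    "Large language models generate answers from a prompt."
)
store = InMemoryVectorStore()
for chunk in chunk_text(document, max_words=7):
    store.add(chunk)

# Step 2: the question is embedded and matched against the stored vectors.
question = "Where are embeddings stored?"
top_chunks = store.search(question, k=1)

# Step 3: retrieved text plus the question form the prompt for the model.
prompt = "Context:\n" + "\n".join(top_chunks) + "\n\nQuestion: " + question
print(prompt)
```

Word counts only match on shared words ("embeddings" in this case), whereas a real embedding model would also relate "store" and "stored"; swapping `toy_embed` for such a model is the main change needed to move from this sketch toward a practical setup.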



[Diagram: simplified view of the RAG workflow]




Example

Suppose you build an "internal document question-and-answer system":

  • You use a vector database (such as Milvus) to store embeddings of all employee manuals, financial reports, and technical documents;

  • A user asks: "What is our sales target for 2023?"

  • The system embeds the question into a vector and then finds similar document paragraphs in the vector database;

  • The retrieved paragraphs and the question are passed to a large model (such as GPT-4), which generates an answer grounded in that context.



Summary


| Item | RAG | Vector database |
| --- | --- | --- |
| Role | Improves generation accuracy by supplying context through retrieval | Stores and searches embeddings (document snippets, questions, etc.) |
| Core feature | Combines search with generation | Quickly finds similar content |
| Connection | Its retrieval step is commonly implemented with a vector database | Supplies similar content to RAG |