Recommendation
Quickly master the core terminology of RAG and gain an in-depth understanding of the architecture and applications of retrieval-augmented generation.
Core content:
1. The core components of the RAG architecture and their meanings
2. Detailed explanation of embedding and vector retrieval technology
3. Key terms of generation and context control
4. Introduction to related technologies and patterns
Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)
RAG (Retrieval-Augmented Generation)
1. Core components of RAG architecture
| Term | Description |
|---|---|
| Retriever | Finds documents or snippets relevant to the user's question in an external knowledge base (e.g., Top-k retrieval in a vector database). |
| Generator | Usually a large language model (e.g., GPT, T5) that uses the retrieved information to generate the final answer (the sketch after this table walks through this flow). |
| Index | The core data structure of the retrieval system, used to look up documents quickly. Usually a vector index. |
| Knowledge Base / Corpus | The collection of structured or unstructured content from which the RAG system retrieves relevant information. |
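To make the division of labor concrete, here is a minimal sketch of the retrieve-then-generate flow. The keyword-overlap retriever and the prompt-only `generate` function are hypothetical stand-ins for a real vector store and an LLM API call; only the plumbing is illustrative.

```python
from typing import List

def retrieve(query: str, knowledge_base: List[str], k: int = 3) -> List[str]:
    """Toy retriever: rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, contexts: List[str]) -> str:
    """Assemble a grounded prompt; a real system would send this to an LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        f"Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"
    )

kb = [
    "RAG combines a retriever with a generator.",
    "BM25 is a sparse retrieval algorithm.",
    "FAISS stores dense vectors for similarity search.",
]
print(generate("What does RAG combine?", retrieve("What does RAG combine?", kb)))
```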
2. Embedding and vector retrieval
| Term | Description |
|---|---|
| Embedding | Converts text into vectors so that it can be compared and retrieved semantically. |
| Dense Retrieval | Retrieval based on semantic vectors (e.g., from DPR or BERT encoders); it typically captures meaning better than traditional sparse methods such as TF-IDF. |
| Vector Store | A database for storing document vectors, such as FAISS, Pinecone, Milvus, or Weaviate (see the FAISS sketch after this table). |
| ANN (Approximate Nearest Neighbor) | A family of algorithms for efficiently finding similar vectors, widely used in large-scale vector retrieval. |
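A minimal sketch of dense retrieval with FAISS (one of the vector stores named above). Random vectors stand in for real embeddings, which in practice would come from an embedding model; `IndexFlatIP` is an exact index, while ANN indexes such as `IndexHNSWFlat` trade exactness for speed on large corpora.

```python
import numpy as np
import faiss  # pip install faiss-cpu

dim = 128
rng = np.random.default_rng(0)

# Random stand-ins for document embeddings (1000 docs, 128-dim).
doc_vectors = rng.random((1000, dim), dtype=np.float32)
faiss.normalize_L2(doc_vectors)        # normalize so inner product = cosine

index = faiss.IndexFlatIP(dim)         # exact inner-product index
index.add(doc_vectors)                 # build the index over all documents

query = rng.random((1, dim), dtype=np.float32)
faiss.normalize_L2(query)

scores, ids = index.search(query, 5)   # Top-k = 5 nearest neighbors
print(ids[0], scores[0])               # document ids and similarity scores
```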
3. Retrieval techniques
| Term | Description |
|---|---|
| DPR (Dense Passage Retrieval) | A dense retrieval method proposed by Facebook that trains separate query and passage encoders. |
| BM25 | A classic sparse retrieval algorithm based on term frequency, commonly used in traditional search engines. |
| Hybrid Retrieval | Combines the results of sparse retrieval (e.g., BM25) and dense retrieval (e.g., DPR) to improve recall (sketched after this table). |
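One common way to fuse sparse and dense results is reciprocal rank fusion (RRF), which scores each document by its rank in every list. A minimal sketch, assuming each retriever returns a ranked list of document ids (the ids below are made up):

```python
from collections import defaultdict
from typing import Dict, List

def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each doc scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_results = ["doc3", "doc1", "doc7"]   # sparse (BM25) ranking
dense_results = ["doc1", "doc5", "doc3"]  # dense (e.g., DPR) ranking
print(reciprocal_rank_fusion([bm25_results, dense_results]))
# doc1 and doc3 rise to the top because both retrievers agree on them
```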
4. Generation and Context Control
| Term | Description |
|---|---|
| Context Window | The maximum input length the LLM can handle; input beyond this length is truncated. |
| Chunking | Splitting long documents into smaller chunks so they fit retrieval and context-window constraints (see the sketch after this table). |
| Top-k Retrieval | Returns the k document chunks or passages most relevant to the query. |
| Prompt Engineering | Designing prompts that guide the language model to answer using the retrieved content. |
| Grounding | Ensuring that generated content is based on real retrieval results rather than hallucinations. |
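A minimal chunking sketch: split a document into fixed-size, overlapping word windows so that each chunk fits the context window and sentences near a boundary appear in two chunks. Word-based sizing is a simplification; production systems often chunk by tokens or sentences.

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping word-window chunks.

    Consecutive chunks share `overlap` words so that content near a
    boundary is never lost to truncation.
    """
    words = text.split()
    step = chunk_size - overlap
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

doc = ("word " * 500).strip()
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])  # 3 chunks of 200 words
```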
5. Related technologies and patterns
| Term | Description |
|---|---|
| Reranking | Re-scores and re-orders preliminary retrieval results to improve quality (sketched after this table). |
| Query Expansion | Improves retrieval by augmenting the query with synonyms, related terms, and reformulations. |
| Multi-hop Retrieval | Supports complex question answering that spans multiple documents or retrieval steps. |
| Fusion-in-Decoder (FiD) | A generative architecture proposed by Facebook AI Research that fuses multiple retrieved documents in the decoder. |
| Retriever-Reader Architecture | The traditional question-answering architecture of retriever + reader, the predecessor of RAG. |
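A minimal reranking sketch. The cheap `toy_overlap_score` below is a hypothetical stand-in for a real cross-encoder (e.g., a fine-tuned BERT that reads query and document together); in practice only the scoring function would change.

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Second-stage reranking: score each (query, candidate) pair with a
    more expensive model and keep only the best top_n."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

def toy_overlap_score(query: str, doc: str) -> float:
    """Hypothetical scorer: fraction of query terms present in the doc."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

candidates = [
    "RAG uses a retriever and generator.",
    "BM25 ranks by term frequency.",
    "Reranking improves precision of the final context.",
]
print(rerank("how does reranking improve RAG", candidates, toy_overlap_score))
```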
6. RAG deployment
| Term | Description |
|---|---|
| Cold Start | The lack of useful retrieval results or embedded representations when the system first runs. |
| Latency | The total time for retrieval + generation; one of the key targets of RAG system optimization. |
| Caching | Caching frequent retrieval or generation results to improve performance (sketched after this table). |
| Incremental Indexing | A mechanism for adding new documents without rebuilding the entire index. |
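A minimal caching sketch using Python's built-in `lru_cache`: repeated identical queries skip the retriever entirely. `expensive_retrieve` is a hypothetical placeholder for a real vector-store call.

```python
from functools import lru_cache

def expensive_retrieve(query: str) -> list:
    """Placeholder: in production this would query the vector store."""
    print(f"retrieving for: {query!r}")  # visible only on cache misses
    return [f"chunk relevant to {query}"]

@lru_cache(maxsize=1024)
def cached_retrieve(query: str) -> tuple:
    """Cache by exact query string; tuples are hashable and safe to cache."""
    return tuple(expensive_retrieve(query))

cached_retrieve("what is RAG")       # miss: runs the retriever
cached_retrieve("what is RAG")       # hit: served from the cache
print(cached_retrieve.cache_info())  # CacheInfo(hits=1, misses=1, ...)
```

Exact-string caching is the simplest policy; semantically equivalent queries with different wording still miss, which is why some systems cache on normalized or embedded query keys instead.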