Golang implements document vector indexing and retrieval system (RAG) based on Redis

Master Golang and Redis to build an efficient document vector indexing and retrieval system.
Core content:
1. Use Redis to implement document vector retrieval and build a RAG knowledge base
2. Eino framework introduction and technology stack combined with large language models
3. System architecture details and project operation test guide
Preface
Hello everyone, this is Bai Ze. This article will explain how to use Redis vector retrieval and LLM to build a RAG knowledge base. The knowledge base stores the introduction of the Eino framework. Each time, try to get the top k related information from the Redis vector index and use LLM to summarize and reply; when there is no relevant knowledge, it prompts that the document was not found, which limits the free play of the large model.
The technology stack used is as follows:
Language: go1.22
Workflow framework: Eino (Byte open source large model workflow development framework)
Vector storage and retrieval: Redis
Large language model: doubao-pro-32k-241215
Vectorized model: doubao-embedding-large-text-240915
?The project has been open sourced at the following address: https://github.com/BaiZe1998/go-learning
Here is an explanation. In the current case, the code for the index building phase is taken from: https://github.com/cloudwego/eino-examples
System Architecture
System architectureAnswer generation phaseQuery retrieval phaseIndex construction phaseMarkdown fileFile loaderDocument segmenterEmbedding modelDocument vectorRedis vector databaseUser question embedding modelQuery vectorKNN vector searchTopKRelated documentsPrompt constructionEnhanced promptLarge language model generationAnswer retriever\nRetrieverRAGSystem generator\nGenerator parameter configuration\ntopK, etc.
Project Operation
1. Docker starts the default repository
cd eino_assistant
docker-compose up -d
#The redis started in this way has some Eino document data that has been vectorized built in
2. Environment variable settings
#In the knowledge base construction phase, a document vectorization model is needed
#In the retrieval enhancement stage, a large language model is needed to summarize the response
cd eino_assistant
source .env
3. Start the rag system
#Use redis as a document database and retrieve 3 records at a time
go run eino/rag/cmd/main.go --redis=true --topk=3
4. Testing
Question> What is Agent
===== 3 related documents retrieved=====
Document [1] Similarity: 0.7705 Title: Untitled
----------------------------------------
# # **What is Agent**
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the future of AI application in life and production...
Document [2] Similarity: 0.7606 Title: Untitled
----------------------------------------
# # **Summarize**
This paper introduces the basic method of building an agent using the Eino framework. Through different methods such as Chain, Tool Calling and ReAct, we can flexibly build an AI agent according to actual needs.
Agent is an important direction for the development of AI technology. It can not only understand the user's intention, but also take the initiative to take action.
Document [3] Similarity: 0.7603 Title: Untitled
----------------------------------------
# # **What is Agent**
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the future of AI application in life and production...
==============================
answer:
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the main form of AI application in life and production in the future.
For code snippets of the examples in this article, see: [eino-examples/quickstart/taskagent](https://github.com/cloudwego/eino-examples/blob/master/quickstart/taskagent/main.go)
5. Ask questions about information that does not exist in the knowledge base
Question > What is Big Data?
===== 3 related documents retrieved=====
Document [1] Similarity: 0.7647 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-07"
lastmod: ""
tags: []
title: Tool
weight: 0
---
Document [2] Similarity: 0.7488 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Document
weight: 0
---
Document [3] Similarity: 0.7419 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Embedding
weight: 0
---
==============================
answer:
I'm sorry, I don't know what big data is and the documentation doesn't provide any information about it.
6. Add "big data" related information to the knowledge base
#Create a big_data.md document in the cmd/knowledgeindexing directory with the following content:
#Big Data
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating new value.
7. Regenerate document vectors and add big data information to the Redis index
yucong@yucongdeMacBook-Air eino_assistant % cd cmd/knowledgeindexing
yucong@yucongdeMacBook-Air knowledgeindexing % go run ./
[start] indexing file: eino-docs/_index.md
[done] indexing file: eino-docs/_index.md, len of parts: 4
[start] indexing file: eino-docs/agent_llm_with_tools.md
[done] indexing file: eino-docs/agent_llm_with_tools.md, len of parts: 1
[start] indexing file: eino-docs/big_data.md
[done] indexing file: eino-docs/big_data.md, len of parts: 1 # You can see that it has been segmented
index success
8. Test again
Question > What is Big Data?
===== 3 related documents retrieved=====
Document [1] Similarity: 0.8913 Title: Big Data
----------------------------------------
#Big Data
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating...
Document [2] Similarity: 0.7647 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-07"
lastmod: ""
tags: []
title: Tool
weight: 0
---
Document [3] Similarity: 0.7488 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Document
weight: 0
---
==============================
answer:
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating new value.
Core business processes
Index building phase
For this part, see: eino_assistant/eino/knowledgeindexing directory code
flow chart:
The index building phase is essentially a workflow, so you can use Goland's Eino Dev plug-in to visualize it. After completion, click Generate Process Framework Code, and then fill in some business implementations:
• File loading: read Markdown documents from the file system • Document segmentation: Split the document into small sections according to logical units such as titles and paragraphs (split by #) • Vector generation: Use embedding models to convert text into high-dimensional vectors (4096) • Redis storage: Store document content, metadata, and vectors in a Redis hash structure
Retrieval stage
See: eino_assistant/eino/rag/retriver.go
• User input: User inputs questions in the terminal • Query vectorization: Use the same embedding model to convert the question into a vector • KNN Search: Perform KNN (K Nearest Neighbor) vector search in Redis • Relevant document acquisition: Get the top K documents that are most relevant to the question semantics
// Retrieve retrieves the documents most relevant to the query
func (r *RedisRetriever) Retrieve(ctx context.Context, query string , topK int ) ([]*schema.Document, error ) {
// Generate query vector
queryVectors, err := r.embedder.EmbedStrings(ctx, [] string {query})
if err != nil {
return nil , fmt.Errorf( "Failed to generate query vector: %w" , err)
}
if len (queryVectors) == 0 || len (queryVectors[ 0 ]) == 0 {
return nil , fmt.Errorf( "The embedding model returns an empty vector" )
}
queryVector := queryVectors[ 0 ]
// Build a vector search query
searchQuery := fmt.Sprintf( "(*)=>[KNN %d @%s $query_vector AS %s]" ,
topK,
redispkg.VectorField,
redispkg.DistanceField)
// Perform a vector search
res, err := r.client.Do(ctx,
"FT.SEARCH" , r.indexName, // Index name to perform the search
searchQuery, // vector search query statement
"PARAMS" , "2" , // parameter declaration, followed by 2 parameters
"query_vector" , vectorToBytes(queryVector), // binary representation of query vector
"DIALECT" , "2" , // Query dialect version
"SORTBY" , redispkg.DistanceField, // Result sorting field
"RETURN" , "3" , redispkg.ContentField, redispkg.MetadataField, redispkg.DistanceField, // return field
).Result()
if err != nil {
return nil , fmt.Errorf( "Failed to execute vector search: %w" , err)
}
// Convert the Redis result to a Document object
return r.parseSearchResults(res)
}
Answer generation phase
See: eino_assistant/eino/rag/generator.go
• Prompt construction: Combine retrieved documents and user questions into enhanced prompts • LLM call: Send augmented hints to the Large Language Model (ARK doubao) • Answer generation: The model generates answers to user questions based on the provided context
// Generate Generate answer
func (g *ArkGenerator) Generate(ctx context.Context, query string , documents []*schema.Document) ( string , error ) {
// Combine context information
context := ""
if len (documents) > 0 {
contextParts := make ([] string , len (documents))
for i, doc := range documents {
// If there is a title in the metadata, add the title information
titleInfo := ""
if title, ok := doc.MetaData[ "title" ].( string ); ok && title != "" {
titleInfo = fmt.Sprintf( "title: %s\n" , title)
}
contextParts[i] = fmt.Sprintf( "Document fragment [%d]:\n%s%s\n" , i+ 1 , titleInfo, doc.Content)
}
context = strings.Join(contextParts, "\n---\n" )
}
// Build prompt
systemPrompt := "You are a knowledge assistant. Answer user questions based on the documentation provided. If the documentation doesn't have relevant information, be honest and say you don't know, don't make up an answer."
userPrompt := query
if context != "" {
userPrompt = fmt.Sprintf( "Answer my question based on the following information:\n\n%s\n\nQuestion: %s" , context, query)
}
// Build request
messages := []chatMessage{
{Role: "system" , Content: systemPrompt},
{Role: "user" , Content: userPrompt},
}
reqBody := chatRequest{
Model: g.modelName,
Messages: messages,
}
//Serialize the request body
jsonData, err := json.Marshal(reqBody)
if err != nil {
return "" , fmt.Errorf( "Serialization request failed: %w" , err)
}
// Create HTTP request
endpoint := fmt.Sprintf( "%s/chat/completions" , g.baseURL)
req, err := http.NewRequestWithContext(ctx, "POST" , endpoint, bytes.NewBuffer(jsonData))
if err != nil {
return "" , fmt.Errorf( "Failed to create HTTP request: %w" , err)
}
// Add header information
req.Header.Set( "Content-Type" , "application/json" )
req.Header.Set( "Authorization" , fmt.Sprintf( "Bearer %s" , g.apiKey))
// Send the request
client := &http.Client{}
resp, err := client.Do(req)
if err != nil {
return "" , fmt.Errorf( "Failed to send request: %w" , err)
}
defer resp.Body.Close()
// Read the response
body, err := io.ReadAll(resp.Body)
if err != nil {
return "" , fmt.Errorf( "Failed to read response: %w" , err)
}
// Check response status
if resp.StatusCode != http.StatusOK {
return "" , fmt.Errorf( "API returned error: %s, status code: %d" , string (body), resp.StatusCode)
}
// Parse the response
var chatResp chatResponse
if err := json.Unmarshal(body, &chatResp); err != nil {
return "" , fmt.Errorf( "Failed to parse response: %w" , err)
}
// Extract the answer
if len (chatResp.Choices) > 0 {
return chatResp.Choices[ 0 ].Message.Content, nil
}
return "" , fmt.Errorf( "API did not return a valid answer" )
}
Main Loop
func main () {
// Define command line parameters
useRedis := flag.Bool( "redis" , true , "Whether to use Redis for retrieval enhancement" )
topK := flag.Int( "topk" , 3 , "Number of documents retrieved" )
flag.Parse()
// Check environment variables
env.MustHasEnvs( "ARK_API_KEY" )
// Build RAG system
ctx := context.Background()
ragSystem, err := rag.BuildRAG(ctx, *useRedis, *topK)
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to build RAG system: %v\n" , err)
os.Exit( 1 )
}
// Display startup information
if *useRedis {
fmt.Println( "Start RAG system (using Redis retrieval)" )
} else {
fmt.Println( "Start RAG system (without retrieval)" )
}
fmt.Println( "Enter question or type 'exit' to exit" )
// Create an input scanner
scanner := bufio.NewScanner(os.Stdin)
// Main loop
for {
fmt.Print( "\nProblem>" )
// Read user input
if !scanner.Scan() {
break
}
input := strings.TrimSpace(scanner.Text())
if input == "" {
continue
}
// Check exit command
if strings.ToLower(input) == "exit" {
break
}
// Handle the problem
answer, err := ragSystem.Answer(ctx, input)
if err != nil {
fmt.Fprintf(os.Stderr, "处理问题时出错: %v\n", err)
continue
}
// 显示回答
fmt.Println("\n回答:")
fmt.Println(answer)
}
if err := scanner.Err(); err != nil {
fmt.Fprintf(os.Stderr, "读取输入时出错: %v\n", err)
}
fmt.Println("再见!")
}