Golang implements document vector indexing and retrieval system (RAG) based on Redis

Written by
Clara Bennett
Updated on:June-25th-2025
Recommendation

Master Golang and Redis to build an efficient document vector indexing and retrieval system.

Core content:
1. Use Redis to implement document vector retrieval and build a RAG knowledge base
2. Eino framework introduction and technology stack combined with large language models
3. System architecture details and project operation test guide

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

 

Preface

Hello everyone, this is Bai Ze. This article will explain how to use Redis vector retrieval and LLM to build a RAG knowledge base. The knowledge base stores the introduction of the Eino framework. Each time, try to get the top k related information from the Redis vector index and use LLM to summarize and reply; when there is no relevant knowledge, it prompts that the document was not found, which limits the free play of the large model.

The technology stack used is as follows:

Language: go1.22

Workflow framework: Eino (Byte open source large model workflow development framework)

Vector storage and retrieval: Redis

Large language model: doubao-pro-32k-241215

Vectorized model: doubao-embedding-large-text-240915

?The project has been open sourced at the following address: https://github.com/BaiZe1998/go-learning

Here is an explanation. In the current case, the code for the index building phase is taken from: https://github.com/cloudwego/eino-examples

System Architecture

System architectureAnswer generation phaseQuery retrieval phaseIndex construction phaseMarkdown fileFile loaderDocument segmenterEmbedding modelDocument vectorRedis vector databaseUser question embedding modelQuery vectorKNN vector searchTopKRelated documentsPrompt constructionEnhanced promptLarge language model generationAnswer retriever\nRetrieverRAGSystem generator\nGenerator parameter configuration\ntopK, etc.

Project Operation

  1. 1. Docker starts the default repository
cd eino_assistant
docker-compose up -d
#The  redis started in this way has some Eino document data that has been vectorized built in
  1. 2. Environment variable settings
#In  the knowledge base construction phase, a document vectorization model is needed
#In  the retrieval enhancement stage, a large language model is needed to summarize the response
cd eino_assistant
source .env
  1. 3. Start the rag system
#Use  redis as a document database and retrieve 3 records at a time
go run eino/rag/cmd/main.go --redis=true --topk=3
  1. 4. Testing
Question> What is Agent

===== 3 related documents retrieved=====

Document [1] Similarity: 0.7705 Title: Untitled
----------------------------------------
# # **What is Agent**
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the future of AI application in life and production...

Document [2] Similarity: 0.7606 Title: Untitled
----------------------------------------
# # **Summarize**
This paper introduces the basic method of building an agent using the Eino framework. Through different methods such as Chain, Tool Calling and ReAct, we can flexibly build an AI agent according to actual needs.
Agent is an important direction for the development of AI technology. It can not only understand the user's intention, but also take the initiative to take action.

Document [3] Similarity: 0.7603 Title: Untitled
----------------------------------------
# # **What is Agent**
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the future of AI application in life and production...

==============================


answer:
An agent is a system that can perceive the environment and take actions to achieve specific goals. In AI applications, agents can autonomously complete complex tasks by combining the understanding ability of large language models and the execution ability of predefined tools. It is the main form of AI application in life and production in the future.

For code snippets of the examples in this article, see: [eino-examples/quickstart/taskagent](https://github.com/cloudwego/eino-examples/blob/master/quickstart/taskagent/main.go) 
  1. 5. Ask questions about information that does not exist in the knowledge base
Question > What is Big Data?

===== 3 related documents retrieved=====

Document [1] Similarity: 0.7647 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-07"
lastmod: ""
tags: []
title: Tool
weight: 0
---

Document [2] Similarity: 0.7488 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Document
weight: 0
---

Document [3] Similarity: 0.7419 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Embedding
weight: 0
---

==============================


answer:
I'm sorry, I don't know what big data is and the documentation doesn't provide any information about it.
  1. 6. Add "big data" related information to the knowledge base
#Create  a big_data.md document in the cmd/knowledgeindexing directory with the following content:
#Big  Data
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating new value.
  1. 7. Regenerate document vectors and add big data information to the Redis index
yucong@yucongdeMacBook-Air eino_assistant % cd cmd/knowledgeindexing 
yucong@yucongdeMacBook-Air knowledgeindexing % go run ./
[start] indexing file: eino-docs/_index.md
[done] indexing file: eino-docs/_index.md, len of parts: 4
[start] indexing file: eino-docs/agent_llm_with_tools.md
[done] indexing file: eino-docs/agent_llm_with_tools.md, len of parts: 1
[start] indexing file: eino-docs/big_data.md
[done] indexing file: eino-docs/big_data.md, len of parts: 1 # You can see that it has been segmented
index success
  1. 8. Test again
Question > What is Big Data?

===== 3 related documents retrieved=====

Document [1] Similarity: 0.8913 Title: Big Data
----------------------------------------
#Big  Data
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating...

Document [2] Similarity: 0.7647 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-07"
lastmod: ""
tags: []
title: Tool
weight: 0
---

Document [3] Similarity: 0.7488 Title: Untitled
----------------------------------------
---
Description: ""
date: "2025-01-06"
lastmod: ""
tags: []
title: Document
weight: 0
---

==============================


answer:
Big Data refers to a collection of data that is large in scale, complex in structure, and cannot be effectively captured, managed, and processed within a reasonable time using traditional data processing tools. Its core value lies in mining the information contained in the data through professional analysis, thereby improving decision-making, optimizing processes, and creating new value.

Core business processes

Index building phase

For this part, see: eino_assistant/eino/knowledgeindexing directory code

flow chart:

The index building phase is essentially a workflow, so you can use Goland's Eino Dev plug-in to visualize it. After completion, click Generate Process Framework Code, and then fill in some business implementations:

  • • File loading: read Markdown documents from the file system
  • • Document segmentation: Split the document into small sections according to logical units such as titles and paragraphs (split by #)
  • • Vector generation: Use embedding models to convert text into high-dimensional vectors (4096)
  • • Redis storage: Store document content, metadata, and vectors in a Redis hash structure

Retrieval stage

See: eino_assistant/eino/rag/retriver.go

  • • User input: User inputs questions in the terminal
  • • Query vectorization: Use the same embedding model to convert the question into a vector
  • • KNN Search: Perform KNN (K Nearest Neighbor) vector search in Redis
  • • Relevant document acquisition: Get the top K documents that are most relevant to the question semantics
// Retrieve retrieves the documents most relevant to the query
func (r *RedisRetriever)   Retrieve(ctx context.Context, query  string , topK  int ) ([]*schema.Document,  error ) {
    // Generate query vector
    queryVectors, err := r.embedder.EmbedStrings(ctx, [] string {query})
    if  err !=  nil  {
        return nil , fmt.Errorf( "Failed to generate query vector: %w" , err)
    }

    if len (queryVectors) ==  0  ||  len (queryVectors[ 0 ]) ==  0  {
        return nil , fmt.Errorf( "The embedding model returns an empty vector" )
    }

    queryVector := queryVectors[ 0 ]

    // Build a vector search query
    searchQuery := fmt.Sprintf( "(*)=>[KNN %d @%s $query_vector AS %s]" ,
        topK,
        redispkg.VectorField,
        redispkg.DistanceField)

    // Perform a vector search
    res, err := r.client.Do(ctx,
        "FT.SEARCH" , r.indexName,  // Index name to perform the search
        searchQuery,    // vector search query statement
        "PARAMS""2"// parameter declaration, followed by 2 parameters
        "query_vector" , vectorToBytes(queryVector),  // binary representation of query vector
        "DIALECT""2"// Query dialect version
        "SORTBY" , redispkg.DistanceField,  // Result sorting field
        "RETURN""3" , redispkg.ContentField, redispkg.MetadataField, redispkg.DistanceField,  // return field
    ).Result()

    if  err !=  nil  {
        return nil , fmt.Errorf( "Failed to execute vector search: %w" , err)
    }

    // Convert the Redis result to a Document object
    return  r.parseSearchResults(res)
}

Answer generation phase

See: eino_assistant/eino/rag/generator.go

  • • Prompt construction: Combine retrieved documents and user questions into enhanced prompts
  • • LLM call: Send augmented hints to the Large Language Model (ARK doubao)
  • • Answer generation: The model generates answers to user questions based on the provided context
// Generate Generate answer
func (g *ArkGenerator)   Generate(ctx context.Context, query  string , documents []*schema.Document) ( string error ) {
    // Combine context information
    context :=  ""
    if len (documents) >  0  {
        contextParts :=  make ([] stringlen (documents))
        for  i, doc :=  range  documents {
            // If there is a title in the metadata, add the title information
            titleInfo :=  ""
            if  title, ok := doc.MetaData[ "title" ].( string ); ok && title !=  ""  {
                titleInfo = fmt.Sprintf( "title: %s\n" , title)
            }
            contextParts[i] = fmt.Sprintf( "Document fragment [%d]:\n%s%s\n" , i+ 1 , titleInfo, doc.Content)
        }
        context = strings.Join(contextParts,  "\n---\n" )
    }

    // Build prompt
    systemPrompt :=  "You are a knowledge assistant. Answer user questions based on the documentation provided. If the documentation doesn't have relevant information, be honest and say you don't know, don't make up an answer."
    userPrompt := query

    if  context !=  ""  {
        userPrompt = fmt.Sprintf( "Answer my question based on the following information:\n\n%s\n\nQuestion: %s" , context, query)
    }

    // Build request
    messages := []chatMessage{
        {Role:  "system" , Content: systemPrompt},
        {Role:  "user" , Content: userPrompt},
    }

    reqBody := chatRequest{
        Model: g.modelName,
        Messages: messages,
    }

    //Serialize the request body
    jsonData, err := json.Marshal(reqBody)
    if  err !=  nil  {
        return "" , fmt.Errorf( "Serialization request failed: %w" , err)
    }

    // Create HTTP request
    endpoint := fmt.Sprintf( "%s/chat/completions" , g.baseURL)
    req, err := http.NewRequestWithContext(ctx,  "POST" , endpoint, bytes.NewBuffer(jsonData))
    if  err !=  nil  {
        return "" , fmt.Errorf( "Failed to create HTTP request: %w" , err)
    }

    // Add header information
    req.Header.Set( "Content-Type""application/json" )
    req.Header.Set( "Authorization" , fmt.Sprintf( "Bearer %s" , g.apiKey))

    // Send the request
    client := &http.Client{}
    resp, err := client.Do(req)
    if  err !=  nil  {
        return "" , fmt.Errorf( "Failed to send request: %w" , err)
    }
    defer  resp.Body.Close()

    // Read the response
    body, err := io.ReadAll(resp.Body)
    if  err !=  nil  {
        return "" , fmt.Errorf( "Failed to read response: %w" , err)
    }

    // Check response status
    if  resp.StatusCode != http.StatusOK {
        return "" , fmt.Errorf( "API returned error: %s, status code: %d"string (body), resp.StatusCode)
    }

    // Parse the response
    var  chatResp chatResponse
    if  err := json.Unmarshal(body, &chatResp); err !=  nil  {
        return "" , fmt.Errorf( "Failed to parse response: %w" , err)
    }

    // Extract the answer
    if len (chatResp.Choices) >  0  {
        return  chatResp.Choices[ 0 ].Message.Content,  nil
    }

    return "" , fmt.Errorf( "API did not return a valid answer" )
}

Main Loop

func main ()   {
    // Define command line parameters
    useRedis := flag.Bool( "redis"true"Whether to use Redis for retrieval enhancement" )
    topK := flag.Int( "topk"3"Number of documents retrieved" )

    flag.Parse()

    // Check environment variables
    env.MustHasEnvs( "ARK_API_KEY" )

    // Build RAG system
    ctx := context.Background()
    ragSystem, err := rag.BuildRAG(ctx, *useRedis, *topK)
    if  err !=  nil  {
        fmt.Fprintf(os.Stderr,  "Failed to build RAG system: %v\n" , err)
        os.Exit( 1 )
    }

    // Display startup information
    if  *useRedis {
        fmt.Println( "Start RAG system (using Redis retrieval)" )
    }  else  {
        fmt.Println( "Start RAG system (without retrieval)" )
    }
    fmt.Println( "Enter question or type 'exit' to exit" )

    // Create an input scanner
    scanner := bufio.NewScanner(os.Stdin)

    // Main loop
    for  {
        fmt.Print( "\nProblem>" )

        // Read user input
        if  !scanner.Scan() {
            break
        }

        input := strings.TrimSpace(scanner.Text())
        if  input ==  ""  {
            continue
        }

        // Check exit command
        if  strings.ToLower(input) ==  "exit"  {
            break
        }

        // Handle the problem
        answer, err := ragSystem.Answer(ctx, input)
        if  err !=  nil  {
            fmt.Fprintf(os.Stderr, "处理问题时出错: %v\n", err)
            continue
        }

        // 显示回答
        fmt.Println("\n回答:")
        fmt.Println(answer)
    }

    if err := scanner.Err(); err != nil {
        fmt.Fprintf(os.Stderr, "读取输入时出错: %v\n", err)
    }

    fmt.Println("再见!")
}