Ten suggestions for implementing RAG in real-world scenarios, and how to improve personalization in RAG

Written by

Silas Grey

Updated on: June 30, 2025

This article first looks at Lao Liu's ten suggestions on implementing RAG. These are insights I have gained from RAG-related work over the past two years, written down for your reference.

In addition, one development direction for RAG is personalization. A recent technical survey covers this topic, so let's take a look at it as well.

Grasping the fundamental issues, tracing them to their root causes, and then specializing and systematizing what we find leads to deeper thinking. Let's work on it together.

1. Ten suggestions for implementing RAG in real scenarios

RAG is everywhere, but it is heavily patched, and it has many variants such as GraphRAG, Multimodal RAG, and DeepResearch. Everyone has their own RAG solution, yet various problems still arise during actual implementation.

Last night, Lao Liu gave a keynote speech titled "RAG's Fancy Variants and Implementation Suggestions - GraphRAG or Multimodal RAG or Deepresearch?" at an online sharing session held ahead of the A2M Artificial Intelligence Innovation Summit. He talked about some interesting things and closed with these 10 suggestions for your reference:

1. Don't adopt RAG just for the sake of adopting RAG, especially for task types such as NL2SQL and KBQA. If you have already solved them well, don't rework them.

2. Don't add variants just for the sake of adding variants. Avoid GraphRAG, Multimodal RAG, DeepResearch, and the like if you can, and just build the most basic RAG.

3. A universal RAG is a standardized product. Standardized products can never solve optimization problems, so the idea of a universal RAG should be abandoned.

4. RAG itself is a rag, a patch made for specific business problems. We must keep this in mind and build RAG for the business, rather than building a business for RAG. Analyze and evaluate the specific cases first. A usable RAG will inevitably contain a lot of routing logic.

5. There are many open-source RAG frameworks at present. Their value lies not in production use but in rapid scenario validation. We need to demystify open-source frameworks.

6. If you can write it yourself, do so. RAG does not involve many complicated things. The homogeneity and black-box nature of open-source frameworks make problems hard to locate, so drop these frameworks where appropriate.

7. RAG itself is ubiquitous. It is a framework rather than a single technology, and more of an engineering architecture.

8. What determines whether RAG is useful is not the RAG technology itself, but whether the user's problem domain is modeled clearly and how well the business implementation logic is designed.

9. Implementation always follows the 80/20 principle. Many optimization techniques are designed to solve the 20% long-tail problems. We need to recognize this and measure the ROI (the input-output ratio).

10. Document parsing is necessary for RAG, but it does not need to achieve 100% restoration; chasing that is the wrong path. Invest in it, but don't over-focus on it. Document parsing is a means, not an end.
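
Suggestion 4 says a usable RAG needs a lot of routing logic. Here is a minimal sketch of what that can look like, assuming three hypothetical handlers (`answer_with_sql`, `answer_with_kbqa`, `answer_with_rag`) and made-up keyword rules. Real systems would more likely use a trained classifier or an LLM to pick the route; the point is only that questions already served well by NL2SQL or KBQA never reach RAG.

```python
from typing import Callable, Dict

# Hypothetical handlers: in a real system each would call an existing pipeline.
def answer_with_sql(query: str) -> str:
    return f"[NL2SQL] {query}"

def answer_with_kbqa(query: str) -> str:
    return f"[KBQA] {query}"

def answer_with_rag(query: str) -> str:
    return f"[RAG] {query}"

ROUTES: Dict[str, Callable[[str], str]] = {
    "sql": answer_with_sql,
    "kbqa": answer_with_kbqa,
    "rag": answer_with_rag,
}

def classify(query: str) -> str:
    """Pick a route with illustrative keyword rules (an assumption, not a standard)."""
    q = query.lower()
    if any(w in q for w in ("how many", "total", "average")):
        return "sql"   # aggregate questions go to the existing NL2SQL path
    if any(w in q for w in ("who is", "capital of")):
        return "kbqa"  # entity lookups go to the existing KBQA path
    return "rag"       # everything else falls back to basic RAG

def route(query: str) -> str:
    """Dispatch the query to the handler chosen by classify()."""
    return ROUTES[classify(query)](query)
```

Calling `route("How many orders last month?")` goes to the NL2SQL handler, while `route("Summarize the refund policy")` falls through to RAG.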

2. How to improve personalization in RAG?

One development direction for RAG is personalization. For example, the recent work "A Survey of Personalization: From RAG to Agent" (https://arxiv.org/pdf/2504.10147) is a technical survey of how to integrate personalized information effectively into the different stages of RAG (pre-retrieval, retrieval, and generation) and into agent-based personalization systems.

The main technical points are as follows:

1. Personalization in the pre-retrieval phase

In the pre-retrieval phase, query processing (Q) uses personalized information to refine the original query, for example through query rewriting or query expansion.

Query rewriting can be divided into direct and assisted personalized query rewriting. Direct personalized query rewriting uses a model to rewrite the query directly. Assisted personalized query rewriting draws on retrieval, reasoning strategies, and external memory.
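
To make the pre-retrieval idea concrete, here is a minimal sketch of personalized query expansion. It assumes a hypothetical `UserProfile` that stands in for external memory, and it uses a simple rule instead of a model, so it is only an illustration of the assisted style, not the survey's method.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class UserProfile:
    # Hypothetical profile: a real system would load this from external memory.
    interests: List[str] = field(default_factory=list)
    max_terms: int = 2  # cap on how many profile terms may be appended

def expand_query(query: str, profile: UserProfile) -> str:
    """Append up to `max_terms` profile interests that the query does not already contain."""
    lowered = query.lower()
    extra = [t for t in profile.interests if t.lower() not in lowered]
    extra = extra[: profile.max_terms]
    return query if not extra else f"{query} {' '.join(extra)}"
```

For a profile with interests `["python", "beginner", "video"]`, the query `"python tutorial"` becomes `"python tutorial beginner video"`, while a user with an empty profile gets the original query back unchanged.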

2. Personalization in the retrieval phase

In the retrieval phase, the retriever (R) uses the personalized information (p) to obtain relevant documents from the corpus (C). The retrieval process can be divided into three steps: indexing, retrieval, and post-retrieval.

The indexing stage can organize the knowledge base data by generating user embeddings. The retrieval stage covers dense retrieval, sparse retrieval, prompt-based retrieval, and other methods. The post-retrieval stage mainly improves the retrieval results through re-ranking, summarization, and compression.
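
As a rough illustration of the retrieval phase, the sketch below blends a query embedding with a user embedding when scoring documents, then keeps the top results. The blending weight `alpha`, the function names, and the toy two-dimensional vectors are all assumptions for illustration; real embeddings would come from an embedding model.

```python
import math
from typing import Dict, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)

def personalized_rank(
    query_vec: Sequence[float],
    user_vec: Sequence[float],
    docs: Dict[str, Sequence[float]],
    alpha: float = 0.3,  # assumed weight of the user embedding in the final score
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score each document by query similarity blended with user similarity."""
    scored = [
        (doc_id, (1 - alpha) * cosine(query_vec, vec) + alpha * cosine(user_vec, vec))
        for doc_id, vec in docs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With the same query, two users with different embeddings receive different second-ranked documents, which is the whole point of putting p into the retriever.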

3. Personalization in the generation phase

In the generation phase, the generator (G) combines the retrieved documents, task-specific prompts, and user preference information (p) to generate customized content. Personalized generation can be achieved through explicit and implicit preference injection.

Explicit preference injection includes direct-integration prompting, summary-augmented prompting, and adaptive prompting, while implicit preference injection is achieved through parameter-efficient fine-tuning and reinforcement learning methods.
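
The simplest form of explicit preference injection is to write the preferences straight into the prompt. Below is a minimal sketch, assuming a hypothetical `build_prompt` helper and a prompt layout chosen only for illustration; the resulting string would then be sent to whatever generator (G) you use.

```python
from typing import List

def build_prompt(question: str, documents: List[str], preferences: List[str]) -> str:
    """Combine user preferences, retrieved documents, and the question into one prompt."""
    # Fall back to a placeholder so the section is never empty.
    pref_block = "\n".join(f"- {p}" for p in preferences) or "- (none stated)"
    doc_block = "\n".join(f"[{i}] {d}" for i, d in enumerate(documents, start=1))
    return (
        "User preferences:\n"
        f"{pref_block}\n\n"
        "Retrieved documents:\n"
        f"{doc_block}\n\n"
        f"Question: {question}\n"
        "Answer using only the documents above and respect the user preferences."
    )
```

Implicit injection has no counterpart at the prompt level, since the preferences are encoded in model weights through fine-tuning or reinforcement learning rather than in text.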

4. Personalization from RAG to agent

Personalized LLM agent systems dynamically combine user context, memory, and external tools or APIs to support highly personalized and goal-oriented interactions. Personalized understanding, personalized planning and execution, and personalized generation are key components of agent systems.

Personalized understanding includes user-profile understanding, role understanding, and joint user-role understanding. Personalized planning and execution cover memory management and tool and API calls. Personalized generation emphasizes alignment with user facts and preferences.
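
To show how memory management and tool calls fit together, here is a toy sketch of a personalized agent. The `PersonalAgent` class, its naive name-matching planner, and the response format are all hypothetical; a real agent would use an LLM for planning and a persistent store for memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class PersonalAgent:
    tools: Dict[str, Callable[[str], str]]            # tool name -> callable
    memory: List[str] = field(default_factory=list)   # remembered user facts

    def remember(self, fact: str) -> None:
        """Store a user fact once (memory management)."""
        if fact not in self.memory:
            self.memory.append(fact)

    def plan(self, request: str) -> str:
        """Naive planning: pick the first tool whose name appears in the request."""
        lowered = request.lower()
        for name in self.tools:
            if name in lowered:
                return name
        return "none"

    def run(self, request: str) -> str:
        """Execute the planned tool and attach the stored user context to the result."""
        tool = self.plan(request)
        result = self.tools[tool](request) if tool in self.tools else "no tool used"
        context = "; ".join(self.memory) or "no stored facts"
        return f"{result} (user context: {context})"
```

For example, an agent built with a `weather` tool that has remembered "lives in Oslo" answers a weather request with the tool output plus that stored context, which a generator could then use to align its reply with user facts.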