The "magic mirror" of the big model is here, let RAG cure the "nonsense" of AI
Updated on: June 20, 2025
Recommendation
AI's hallucination problem is not unsolvable: RAG technology helps AI evolve from "memory-based" to "cognition-based".
Core content:
1. The hallucination problem of large models and its causes
2. How RAG technology equips AI with a "search engine"
3. RAG's three major advantages and future development directions
Yang Fangxian
Founder of 53A / Most Valuable Expert of Tencent Cloud (TVP)
When AI starts to talk nonsense

I believe many people have run into this when using large models like DeepSeek: even these models can talk nonsense with a straight face. For example, when a candidate asked an AI interviewer, "What is the company's overtime culture like?", the AI earnestly made up a welfare policy that does not exist. Hallucinated answers like this are a fatal weakness of Large Language Models (LLMs). It is like a top student who is brilliant at mathematics yet confidently writes down a wrong answer. This time we will look at how RAG (Retrieval-Augmented Generation) can make AI less prone to such blunders.

The origin of the large model's "illusions"

As the saying goes, know yourself and know your enemy, and you will win every battle. Let us first understand where the "illusions" of large models come from. They stem mainly from two flaws:
1. Knowledge truncation: the model has only seen its training data and knows nothing outside that set. For example, a model trained on data up to 2023 can never know who won the World Cup in 2024.
2. Probability game: a language model is, at its core, predicting the probability of the next word. Faced with open-ended questions, it may "piece together" answers that sound plausible but are completely wrong.

RAG appears, equipping AI with a "search engine"

If a traditional large model studies behind closed doors, RAG connects it to the outside world. When a user asks a question, the system first searches an external knowledge base (documents, databases) for relevant information; the retrieved content is then fed to the language model as context to produce the final answer (a minimal code sketch of this retrieve-then-generate loop appears at the end of this section). For example, when a user asks, "What is the range of Tesla's latest model in kilometers?", RAG first pulls the latest data from the official website or a news library, and then lets the model generate an answer grounded in those facts rather than guessing. Anyone who has used Tencent Yuanbao may have noticed that you can enable web search alongside selecting a model, and that works in much the same way. However, the current domestic web environment also contains plenty of junk content: searching a so-called official website can surface a pile of advertisements, which may distort the final answer. That is why building a curated knowledge base is also essential.

RAG's three "killer weapons"

1. Fresh knowledge: a traditional large model's knowledge stops at its training cutoff, while RAG can dynamically update the knowledge base. In fields like finance, where policies and regulations change every day, RAG keeps the AI up to date.
2. Traceability: RAG can not only generate answers, but also tell you where each answer comes from.
3. Low cost: compared with retraining a large model, RAG only requires optimizing the retrieval module and prompt engineering, which costs less and responds faster.

The emergence of RAG is driving AI's evolution from "memory-based" to "cognition-based". Multimodal retrieval lets AI not only read text but also consult images and videos, building a richer knowledge network; adjusting retrieval strategies in real time based on user feedback makes AI smarter; and local deployment with a local knowledge base keeps corporate data from leaking during collaboration, with permissions controlling what each user can access (see the second sketch at the end of this section).
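To make the retrieve-then-generate loop above concrete, here is a minimal sketch in Python. Everything in it is a hypothetical placeholder: the toy knowledge base, the word-overlap scoring (standing in for embedding similarity search in a real vector store), and the stubbed call_llm function. Note how the answer is returned together with its sources, which is the "traceability" advantage described above.

```python
# A minimal RAG sketch: retrieve relevant documents, build a grounded prompt,
# and return the answer together with its sources. All names and data here
# are hypothetical placeholders, not a real library API.

# Toy knowledge base: (source, text) pairs. A real system would use a
# vector store with embedding similarity instead of word overlap.
KNOWLEDGE_BASE = [
    ("official-site/specs", "The latest model has a stated range of 700 km."),
    ("news/battery-update", "The maker announced an updated battery pack this year."),
]

def retrieve(question: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by crude word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        ((len(q_words & set(text.lower().split())), source, text)
         for source, text in KNOWLEDGE_BASE),
        reverse=True,
    )
    return [(source, text) for score, source, text in scored if score > 0][:top_k]

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call."""
    return "[answer generated from the context in the prompt]"

def answer(question: str) -> dict:
    docs = retrieve(question)
    context = "\n".join(f"[{source}] {text}" for source, text in docs)
    prompt = (
        "Answer using ONLY the context below and cite sources in brackets.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # Returning sources alongside the answer is what makes RAG traceable.
    return {"answer": call_llm(prompt), "sources": [source for source, _ in docs]}

print(answer("What is the range of the latest model in km?"))
```

Because retrieval and generation are separate steps, updating the knowledge base or tuning the prompt changes the system's behavior without touching the model itself, which is exactly why RAG is cheaper than retraining.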
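And here is an equally hypothetical sketch of the permission control mentioned just above: documents are filtered by the user's role at retrieval time, before anything reaches the model, so restricted content can never leak into an answer.

```python
# Hypothetical sketch: enforce access rights at retrieval time so that
# documents a user may not see never enter the model's context.
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    allowed_roles: set[str]  # roles permitted to read this document

DOCS = [
    Doc("Public product FAQ ...", {"public", "employee", "hr"}),
    Doc("Internal salary bands ...", {"hr"}),
]

def retrieve_for_user(question: str, role: str) -> list[Doc]:
    # Filter first, then rank; ranking (omitted) would work as in the
    # previous sketch, but only over the documents this role may see.
    return [d for d in DOCS if role in d.allowed_roles]

# An "employee" query cannot surface the HR-only document.
print([d.text for d in retrieve_for_user("What are the salary bands?", "employee")])
```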
The hallucination problem of large models is not unsolvable; the key lies in whether we have equipped them with RAG as an "auditor". Many companies are already combining large models with RAG to build their own local environments. Is your team implementing RAG too?