Is RAG dead? No, it will dominate the future of AI

RAG technology plays an irreplaceable role in the development of AI. How will it affect the future?
Core content:
1. The essence of RAG and its relationship with new AI models
2. The three fatal weaknesses of language models and RAG's solutions
3. The complementarity of RAG with other technologies and its position in the future of AI
Every few months, the same thing happens in the AI community: a more powerful model ships, a parameter or context-length breakthrough is announced, and people start declaring that "RAG is dead." The most recent round came when Meta released Llama 4 Scout, with its 10-million-token context window. Overnight, RAG seemed to be standing at the edge of a cliff again.
But every time people shout "RAG is dead," they underestimate what RAG actually is. RAG was never merely about expanding the context window or patching the shortcomings of model memory. Five years ago, when we first proposed RAG at Meta, the original goal was simple: inject external knowledge into the model in real time to make up for the limitations of pre-training data.
What we discovered is that no matter how language models evolve, they cannot escape three fatal weaknesses: they cannot directly access private data, their knowledge goes stale, and they hallucinate. A model is always confined to the world defined by its training data, while the real world is not static; it changes rapidly and keeps expanding.
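The real-time injection described above can be sketched in a few lines. Everything here is illustrative: the in-memory document list, the `retrieve` helper, and the prompt template are assumptions for the sketch, not any particular library's API. The point is only the shape of the pipeline: fetch fresh external knowledge first, then hand it to the model as context.

```python
# Minimal retrieve-then-generate sketch (illustrative only).
# DOCS stands in for an external, up-to-date knowledge source
# that the model's training data cannot contain.
DOCS = [
    "The 2025 pricing page lists the Pro plan at $30 per seat.",
    "Support hours were extended to 24/7 in March 2025.",
    "The legacy API was deprecated in January 2025.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Inject retrieved passages so the model answers from fresh context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

print(build_prompt("What does the Pro plan cost?", DOCS))
```

A production system would swap the word-overlap scorer for embedding similarity and the document list for a vector store, but the contract is the same: the model's knowledge is refreshed per query, not per training run.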
Many people assume that since context windows keep growing, stuffing enough data into the model will solve every problem. That idea is naive. Do you flip through a textbook from cover to cover every time you need one answer? Obviously not; doing so would be inefficient to the point of absurdity. Yet some people are now repeating exactly that logic in AI.
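The textbook analogy maps directly onto how retrieval avoids a full scan. The toy inverted index below (the `PAGES` data and index layout are assumptions for illustration, not a production design) answers "which pages mention this term" with one dictionary lookup, instead of re-reading every page for every question, which is what stuffing the whole corpus into the context window amounts to.

```python
from collections import defaultdict

# A tiny "textbook": page number -> page text.
PAGES = {
    1: "gradient descent updates weights",
    2: "transformers use attention",
    3: "attention weights are normalized",
}

# Build the index at the back of the textbook once, up front.
index: dict[str, set[int]] = defaultdict(set)
for page, text in PAGES.items():
    for word in text.split():
        index[word].add(page)

# A query touches only the pages that mention the term,
# instead of scanning every page on every question.
print(sorted(index["attention"]))  # -> [2, 3]
```

Indexing costs one pass at build time; after that, each query's work scales with the number of matching pages, not with the size of the whole book.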
Truly excellent tools never require us to abandon other tools; a truly effective system relies on multiple technologies working together. In a computer, the hard disk, memory, and network interface each have their role, and no one discards the hard disk just because memory capacity grows. Likewise, the future of AI will not be dominated by a single technology, but by the combination of RAG, fine-tuning, large context windows, and other techniques, each playing to its strengths.
Humans naturally like simple binary oppositions: black or white, this or that. But in technology, such oppositions are often false and even misleading. When people pit RAG against large context windows, fine-tuning, or MCP (Model Context Protocol), they ignore that these technologies are complementary. Oversimplified declarations spread easily on social media, but real-world applications are always more complex, more nuanced, and more pragmatic than slogans.
So the next time you see a high-profile declaration that "RAG is dead," stop and ask whether it is yet another misunderstanding of the technology's nature. Those who truly understand RAG never see it as one side of a technological contest; they see it as necessary infrastructure, something that will never really die.
Real technological progress does not replace old tools; it extends their boundaries. As long as AI must process an ever-expanding flood of information, and as long as models carry inherent limitations, RAG will not become outdated. It does not need to be resurrected, because it never really died.