I can’t swallow any more of the pie in the sky RAG keeps drawing…

Two years of hands-on RAG practice: the collision between ideal and reality.
Core content:
1. RAG's original promise: the "intelligent brain" of the enterprise knowledge base
2. The implementation dilemma: problems and challenges in real-world deployments
3. Technology and cost: the struggle to optimize RAG, and what customers said about it
After two years of RAG, what do we have to show for it? A stack of hollow PPTs?
Two years ago, RAG was hyped as a savior: it would transform enterprise knowledge bases from piles of rigid documents into a thinking "intelligent brain". At tech conferences, the big vendors paraded the halos of "vector databases" and "semantic search", vowing that RAG could dig out precise answers and analyze, reason, and draw insights like a human expert.
Looking back at this pile of overhyped concepts, I just want to ask: has your RAG actually shipped? Did the customer sign off? Or did the company spend real money and get back nothing but the technical team's self-satisfaction and a screen full of "GenAI" labels? It took two years for RAG to fall from the altar into the quagmire.
In 2023, generative AI was erupting like a volcano, and RAG was touted as a "must-have" for enterprise AI. The vision it promised was tempting enough: use semantic retrieval to capture the soul of your knowledge, then use a large model to generate fluent answers.
Imagine throwing in a pile of company documents and watching RAG spit out financial-report interpretations, product comparisons, even industry trend analysis: effortless, and impressive to boot. Two years later, reality has landed like a ruthless slap in the face, and everyone is still dizzy.
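On paper, the promised pipeline really is that short. Here is a minimal toy sketch of it: a bag-of-words vector stands in for a real embedding model, and `llm` is a hypothetical prompt-to-string callable, not any vendor's actual API:

```python
import math
from collections import Counter

def toy_embed(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model; a production
    # stack would call an embedding API and a vector database instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(v * b[t] for t, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rag_answer(question: str, chunks: list[str], llm, k: int = 3) -> str:
    # "Semantic retrieval": rank document chunks by similarity to the question.
    q = toy_embed(question)
    top = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)[:k]
    # "Generation": stuff the winners into a prompt; llm is any prompt -> str callable.
    return llm("Answer from this context only:\n" + "\n\n".join(top) + f"\n\nQuestion: {question}")
```

Notice what the "retrieval" step actually is: a ranked similarity lookup. Everything that looks like thinking happens inside the final `llm` call.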
The symptoms are always the same. The answer sits plainly in a document, but RAG says "not found"; ask about a variant of an industry term and it draws a complete blank; give it a complex question that requires stitching several documents together and it falls apart, unable to piece an answer together.
Have you lived through this suffocating routine? Endless optimization: tweaking chunk sizes, piling on knowledge graphs, even paying to fine-tune models. And the result? Mountains of compute burned for toothpaste-squeeze gains, while the customer frowns and says: "This thing is worse than our old search box!" How is that "intelligent"? It feels like being slapped by the customer and still having to smile through it.
Peel back the concept and the truth is plain: RAG is a half-finished product. Its core is a semantic search tool with a generation module bolted on, yet it gets sold as the "future of knowledge management". A vector database sounds high-end and can indeed match synonymous phrasings, but to put it bluntly, and as the sketch above makes plain, it is a finder, not a thinking machine.
Real understanding, reasoning, and insight all come from the large model; RAG's retrieval tricks simply aren't enough. Ask it a genuinely complex question, such as "analyze the gap between our product and our competitors'", and what can it do? Pull out a few document fragments and hand you a heap of disconnected snippets. Reasoning? Forget it.
What’s more maddening is that optimizing RAG is like changing the tires on a broken-down car: no matter how hard you work, it won't go any faster. Vector search performs fine on small, clean datasets in the lab, but drop it into an enterprise, with messy documents, inconsistent terminology, and even flatly contradictory data, and it shows its true colors immediately.
Have you tried tuning the chunking strategy? Stacking on a knowledge graph? Spending millions to fine-tune the model? Congratulations: the results may improve a little, but the customer still finds the answers dumb, and the maintenance cost has multiplied. That doesn't look like technological progress; it looks like enterprises paying for academia's self-entertainment. Ask yourself honestly: in what way is RAG actually better than the old solution? (For the uninitiated, the "chunking" being tuned is sketched below.)
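For readers who haven't lived through the tuning, the "chunking strategy" in question is usually something this mundane: a fixed-size sliding window. The sizes below are arbitrary illustrations, not recommendations:

```python
def chunk(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    # Fixed-size sliding-window slicing: the knob teams tune for months.
    # Too small and answers get shredded across chunks; too large and
    # retrieval drags in noise. No single setting fixes both at once.
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Every variant (by sentence, by heading, by token count) trades the same two failure modes against each other.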
Let's tear off the high-sounding mask and stick to facts. What exactly is "revolutionary" about RAG? How is it more advanced than traditional full-text search? Don't lecture me about "semantic matching": in real use, RAG's hit rate often trails full-text search badly. Full-text search has been honed for twenty years and is rock solid: low cost, fast to deploy, with predictable behavior. What do customers want? Fast, accurate, no-nonsense answers. Do they care whether you used RAG or the ancestral search box? Yet here you are, complicating a simple problem and burning money, people, and time, just to add one line to the report: "We use cutting-edge RAG technology."
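And "twenty years of honing" is not an empty phrase. The core of most full-text engines is Okapi BM25 (Lucene and Elasticsearch ship a variant of it as their default scoring), and the whole formula fits on a napkin. A self-contained toy version, with naive whitespace tokenization where a real engine would do proper text analysis:

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    # Toy Okapi BM25. Tokenization is naive whitespace splitting; real
    # engines add stemming, stopwords, and language-specific analyzers.
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    df = Counter(t for d in tokenized for t in set(d))  # document frequency per term
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        s = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(s)
    return scores
```

No GPUs, no embedding pipeline to keep in sync with the index, and the ranking is explainable term by term.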
Dig deeper and RAG's failure isn't a technology problem; it's a thinking problem. Enterprise knowledge management scenarios vary wildly: some need one clear fact, some need information connected and trends analyzed, some simply want to save manpower. And RAG? It tried to be everything and ended up good at nothing. In customer service, its answers are worse than a human-curated FAQ; in strategic analysis, it can't support even the most basic logical deduction. Have you ever asked yourself: do customers actually need this "high tech"? Or have we plunged into the fog of technology and forgotten the goal?
Wake up and try a different approach; don't stay stuck in the RAG trap. Change the framing and the problem gets much simpler. The core of enterprise knowledge management is pushing the right answer to the user: fast, accurate, no nonsense. Complex reasoning? Multi-document synthesis? Those are rare scenarios; stop treating them as the universal requirement. What does a reliable solution look like? Use a large model as a "question translator" to break the user's messy question into clear keywords and intent; then let full-text search do the finding, because it stays stable no matter how messy the documents are; finally, let the large model package the results into a fluent answer. Customers are comfortable, and the team is spared the maintenance grind. The whole loop is sketched below.
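A minimal sketch of that three-step loop, reusing the toy BM25 scorer above for the search layer; `llm` remains a hypothetical prompt-to-string callable, not any particular vendor's API:

```python
def fulltext_search(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    # Any real engine (Elasticsearch, Postgres FTS, SQLite FTS5, ...)
    # slots in here; the toy bm25_scores() above works for illustration.
    ranked = sorted(zip(bm25_scores(query, docs), docs), reverse=True)
    return [doc for _, doc in ranked[:top_k]]

def answer(question: str, docs: list[str], llm) -> str:
    # Step 1: the LLM as "question translator": messy question -> clean keywords.
    keywords = llm(f"Extract the key search terms from this question: {question}")
    # Step 2: boring, stable full-text search does the actual finding,
    # returning whole documents rather than shredded chunks.
    context = "\n\n".join(fulltext_search(keywords, docs))
    # Step 3: the LLM packages the results into a fluent answer.
    return llm(f"Using these documents:\n{context}\n\nAnswer the question: {question}")
```

The design choice worth noticing: the large model never touches retrieval quality. It only translates on the way in and polishes on the way out, so the part that has to be reliable is the twenty-year-old part.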
This approach isn't fancy, but it works. Full-text search returns whole documents rather than RAG's shredded fragments, the context survives intact, and the answers hold up. Cost? Far lower than RAG's. Deployment? Live within a week. Results? Customers nod and leadership breathes again. Isn't that better than pouring money into RAG? Stop dreaming; customers want results.
The future of RAG? Let the big companies and academia keep playing with it. Hybrid Search, Agentic RAG, GraphRAG: they all sound impressive, but a company is not a laboratory, and budgets don't blow in on the wind. The reality of 2025 is this: RAG is an immature prototype, and its engineering pitfalls pile higher than Everest. Technical leaders, dare to ask yourselves: in two years, what has your RAG project actually solved for customers? Did you truly bring the knowledge base to life, or just accumulate a stack of hollow PPTs? Stop hiding behind "cutting-edge technology". Customers want results, not your technical sentiments. Wake up, let go of the RAG obsession, return to customer needs, and pick a path that actually runs. Don't let the company's budget feed this ingrate for another two years.