10 Lessons Learned from Deploying RAG Agents in Production

Written by
Iris Vance
Updated on: June 24, 2025
Recommendation

Practical experience from deploying RAG agents: a must-read guide for enterprise AI transformation.

Core content:
1. Why systems thinking matters when deploying RAG agents
2. The "context paradox" facing enterprise AI and strategies for addressing it
3. 10 key lessons drawn from Fortune 500 deployments

Yang Fangxian
Founder of 53A, Tencent Cloud Most Valuable Expert (TVP)

[Image: "Agents in Production: Insights from Contextual AI's CEO"]

AI agents show incredible potential, but enterprises often struggle to get real value beyond pilots. The "context paradox" is a major obstacle: AI excels at complex tasks but struggles to understand enterprise-specific context. This article is based on the experience of Douwe Kiela, CEO of Contextual AI, and is relevant to any team deploying retrieval-augmented generation (RAG) systems. It distills 10 key lessons from scaling AI for Fortune 500 companies, focusing on systems thinking, specialization, and production readiness.

I'm Douwe Kiela, CEO of Contextual AI. Today I want to talk about RAG systems in production, specifically RAG agents, and share what I've learned from both AI research and running an enterprise AI company.

The latest generation of large language models (LLMs) has demonstrated amazing reasoning capabilities. However, realizing their true value in an enterprise setting requires applying these capabilities to the "right" enterprise data. Amid the craze over AI agents, we must keep in mind this classic adage:

" Garbage in; garbage out "

Language models can only work effectively in the right context.

Opportunities and Challenges of Enterprise AI

The opportunity for enterprise AI is huge: McKinsey predicts it could add $4.4 trillion in value to the global economy. Everyone wants a piece of the action. But frustration is palpable. Many VPs of AI are under pressure to deliver a return on investment (ROI), and Forbes research shows that only a quarter of businesses are actually benefiting from AI.

The context paradox in AI

Why does this dilemma exist? It mirrors Moravec's paradox in robotics: tasks that are simple for humans (such as vacuuming) are difficult for robots, while tasks that are complex for humans (such as playing chess) are comparatively easy for machines.

Likewise, enterprise AI faces the “paradox of context.” LLMs excel at tasks like coding or solving math problems, even outperforming humans. But they have great trouble putting information into the right “context.” Humans, especially experts, can do this easily with years of experience and intuition.

This paradox is the key to improving ROI. Currently, AI focuses on convenience and efficiency through general assistants. But enterprises are pursuing "differentiated value" and business transformation. Achieving this higher value requires better handling of enterprise-specific environments; as the demand for value increases, the demand for context also increases.

It was this realization that led us to found Contextual AI two years ago. The following are lessons learned from deploying enterprise RAG systems at scale, with a focus on building robust systems for Fortune 500 companies.

10 Lessons Learned from RAG Agents in Production

1. Think systems, not just models

Language models are great, but they are typically only about 20% of a complete system. In enterprise AI, that system usually means retrieval-augmented generation (RAG). RAG, which my team and I originally pioneered at Facebook AI Research, is the standard way to get generative AI to work with your data.

People often focus entirely on the new LLM and lose sight of the surrounding systems that actually solve the problem. An average model in a good RAG pipeline can outperform a great model in a bad pipeline. "Focus on the whole system; the model is just one component."
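
To make the "whole system" point concrete, here is a minimal sketch of a RAG pipeline in Python. It is illustrative only: the toy lexical retriever, the pass-through reranker, and the llm_generate stub are placeholders assumed for this example, not Contextual AI's implementation. In a real system they would be a vector or hybrid retriever, a trained reranker, and a call to your model provider.

    # Minimal sketch: the LLM is one stage in a larger pipeline.
    from dataclasses import dataclass

    @dataclass
    class Document:
        doc_id: str
        text: str

    def retrieve(query: str, corpus: list[Document], k: int = 20) -> list[Document]:
        """Toy lexical retriever: rank documents by query-term overlap."""
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.text.lower().split())), d) for d in corpus]
        return [d for score, d in sorted(scored, key=lambda s: -s[0])[:k] if score > 0]

    def rerank(query: str, candidates: list[Document], k: int = 5) -> list[Document]:
        """Placeholder for a trained reranker; here it simply truncates."""
        return candidates[:k]

    def llm_generate(prompt: str) -> str:
        """Stub for whatever model you actually call; swap in your provider's client."""
        return f"[answer grounded in a prompt of {len(prompt)} characters]"

    def answer(query: str, corpus: list[Document]) -> str:
        context = rerank(query, retrieve(query, corpus))
        prompt = ("Answer using only this context:\n"
                  + "\n".join(f"[{d.doc_id}] {d.text}" for d in context)
                  + f"\n\nQuestion: {query}")
        return llm_generate(prompt)  # the model is one call among several stages

Swapping a stronger model into llm_generate changes one line; most of the answer quality is determined by what the retrieval and reranking stages put into the prompt.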

2. Specialization leads to excellence

Enterprise expertise is your most valuable asset, and the goal is to unlock that organizational knowledge. Generic assistants cannot match the deep expertise within a company; "specialization" is the key to capturing it effectively.

At Contextual AI, we call this "specialization over artificial general intelligence (AGI)." While AGI has its place, solving complex domain-specific problems requires specialization to achieve better results. It may seem counterintuitive in the midst of the AGI craze, but specialization makes it easier to solve real business problems.

3. Data is your moat (at scale)

Over time, a company is defined by its data as employees come and go. This data represents the company’s long-term identity and competitive advantage. A common misconception is that data must be completely clean to be usable by AI. The real challenge and opportunity lies in making AI work efficiently on “large-scale noisy data”.

Doing this unlocks differentiated value and creates a competitive advantage because your unique data defines your company’s personality.

4. Design for production, not just for pilots

The hard truth: building a RAG demo is relatively easy. A framework and a few documents will usually earn good reviews from the team. But the leap to production is huge:

  • Scaling from tens of documents to millions
  • Supporting thousands of users
  • Handling many different use cases
  • Meeting stringent security and compliance requirements

Existing open source tools often fall short at this scale. The gap between pilot and production is huge. "Design for production from day one" to avoid costly pitfalls.
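
One concrete gap between pilot and production is ingestion. A demo loops over a handful of files; production needs batched, concurrent processing of millions of documents that tolerates per-document failures. A rough sketch follows, where parse_and_index is a hypothetical stand-in for whatever parsing, chunking, and indexing stack you actually use:

    import logging
    from concurrent.futures import ThreadPoolExecutor, as_completed
    from pathlib import Path

    logging.basicConfig(level=logging.INFO)

    def parse_and_index(path: Path) -> None:
        """Placeholder: parse one document and write it to the index."""
        _ = path.read_bytes()  # real code would extract text, chunk, embed, and index it

    def ingest(paths: list[Path], max_workers: int = 16) -> tuple[int, int]:
        """Ingest documents concurrently; one bad file must not stop the rest."""
        ok, failed = 0, 0
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(parse_and_index, p): p for p in paths}
            for fut in as_completed(futures):
                try:
                    fut.result()
                    ok += 1
                except Exception:
                    logging.exception("failed to ingest %s", futures[fut])
                    failed += 1
        return ok, failed

At real scale this becomes a distributed queue with retries and monitoring, but the principle is the same: design the ingestion path for volume and failure from day one.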

5. Completion is more important than perfection

In production deployment, "done is better than perfect". Let real users try your RAG agent as early as possible, even if it is just a minimum viable product. Collect feedback and iterate to reach a "good enough" state.

Waiting too long for perfection will make the leap from demo to production more difficult. Iteration based on real user feedback is critical to the successful deployment of enterprise AI.

6. Let engineering teams focus on creating value, not tedious tasks

To achieve speed and iteration, ensure engineers focus on delivering "business value and differentiation" rather than mundane tasks. It's easy for engineers to get caught up in optimizing chunking strategies or perfecting prompts, tasks that should ideally be abstracted away by a robust platform.

Let engineers focus on doing work that is truly important to competitiveness.
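
As an illustration of the plumbing this lesson argues a platform should own, here is the sort of hand-rolled sliding-window chunker teams end up tuning endlessly; the window and overlap values are arbitrary examples, not recommended settings.

    # Hand-rolled sliding-window chunker: the kind of low-level code that eats
    # engineering time without differentiating the business.
    def chunk(text: str, window: int = 800, overlap: int = 200) -> list[str]:
        chunks, start = [], 0
        step = max(window - overlap, 1)  # guard against a non-advancing window
        while start < len(text):
            chunks.append(text[start:start + window])
            start += step
        return chunks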

7. Make AI easy to use and integrate

Deploying generative AI in production is not the same as ensuring it will be widely adopted. If excessive risk controls make the system unusable, or if users don’t understand how to use it effectively, these systems often sit idle.

Therefore, it is crucial to "improve the ease of use of AI". This is not just about enabling AI to process enterprise data, but also about "seamlessly integrating it into users' existing workflows". Tighter workflow integration can significantly increase the chances of successful adoption.
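
As one way to picture that integration, here is a hypothetical sketch that wraps a RAG agent in a small HTTP endpoint so it can be called from the chat tools, ticketing systems, or portals people already work in. FastAPI is my choice for the example, not something the article prescribes, and run_rag_agent is a stand-in for the real pipeline.

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class AskRequest(BaseModel):
        question: str

    class AskResponse(BaseModel):
        answer: str
        citations: list[str]

    def run_rag_agent(question: str) -> AskResponse:
        """Stand-in for the actual RAG pipeline; returns an answer plus sources."""
        return AskResponse(answer=f"(stub answer to: {question})", citations=[])

    @app.post("/ask", response_model=AskResponse)
    def ask(req: AskRequest) -> AskResponse:
        # A plain HTTP endpoint is easy to call from a Slack bot, a ticketing
        # system, or an internal portal, so the agent meets users where they work.
        return run_rag_agent(req.question)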

8. Design for “wow moments”

Driving usage requires stickiness, often triggered by a “wow moment” — the moment when a user instantly understands the value of a tool. Design your onboarding and initial user experience to make that moment come quickly.

For example, at Qualcomm, where our systems support thousands of engineers around the world, one user was pleasantly surprised when they found an answer in a document that had been buried for seven years, instantly solving a long-standing problem. These "small wins" are a powerful driver of adoption.

9. Focus on observability, not just accuracy

Accuracy is important, but it is becoming a baseline requirement, and achieving 100% is rarely possible. Businesses are increasingly concerned with managing the inevitable residual inaccuracy (the remaining 5-10% of cases).

After exceeding the minimum accuracy threshold, the focus shifts to managing inaccuracy through "observability". This requires robust assessment methods and audit trails, especially in regulated industries. "Proper attribution" (linking answers to source documents) in the RAG system is critical. Implement post-processing checks to validate claims and ensure evidence-based responses.
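
A minimal sketch of the kind of post-processing check described above: flag answer sentences that have little lexical overlap with any retrieved source. A production system would use an entailment or grounding model rather than word overlap, and the 0.3 threshold is an arbitrary example.

    import re

    def grounding_report(answer: str, sources: dict[str, str],
                         threshold: float = 0.3) -> list[dict]:
        """Attribute each answer sentence to its best-matching source, or flag it."""
        report = []
        for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
            tokens = set(sentence.lower().split())
            if not tokens:
                continue
            best_id, best_score = None, 0.0
            for doc_id, text in sources.items():
                overlap = len(tokens & set(text.lower().split())) / len(tokens)
                if overlap > best_score:
                    best_id, best_score = doc_id, overlap
            report.append({
                "sentence": sentence,
                "attributed_to": best_id if best_score >= threshold else None,
                "score": round(best_score, 2),  # keep these scores for the audit trail
            })
        return report

Each entry either links a claim back to a source document or flags it as unsupported, which is the kind of evidence an audit trail in a regulated industry needs.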

10. Be Ambitious

Many AI projects fail not because the goals were too high, but because they were too low. Deploying generative AI for menial tasks, such as basic HR queries, yields a tiny return on investment and often results in wasted resources.

Instead, "aim for ambitious goals" that provide substantial rewards when you succeed. We are in a time of change. AI is poised to reshape society. Those working in AI have the opportunity to drive meaningful change. Don't settle for the low-hanging fruit, aim higher.

Conclusion

The context paradox remains a key challenge for enterprise AI. By adopting these lessons, you can turn that challenge into a significant opportunity: think in systems, specialize, treat your data as a moat, design for production, iterate quickly, keep engineers focused on value, make AI easy to use and well integrated, create "wow" moments, invest in observability, and be ambitious.