One of the fastest-growing use cases for LLMs is RAG, or retrieval augmented generation. In this use case, the context window of an LLM is enhanced with grounded content (such as search results, meeting transcriptions, and so on), and the LLM is then able to generate a succinct summary.
Combined with the use of LLMs for embeddings and retrieval, this has breathed new life into RAG use cases, especially search.
However, we are now beginning to understand that the RAG architecture has scaling challenges. The belief was that if you give an LLM vast amounts of data, it will be able to sort through it and provide correct results.
Unfortunately, that is not how it plays out. We often hear from customers who run into poor results as they add multiple data sources to an existing RAG-based search tool. The tool surfaces irrelevant or low-quality results even when higher-quality, more relevant content is available.
Let me explain with an example. Say I am a salesperson and I want to find information about an opportunity. I have connected my copilot or assistant to a diverse set of sources: CRM, email, calendar, Slack, Google Drive, and so on.
When I ask about the status of an opportunity, a basic RAG-based search engine has no insight into where to go or which source to trust. What if I want to know about the most recent updates on an opportunity? What if I want to find out when an opportunity will close? What if I want to find out when I am meeting a customer next?
As you can tell, a RAG-based architecture will return an unpredictable and generally unreliable set of results. It might highlight results from email instead of the CRM when I ask about the status of an opportunity, or surface results from Slack instead of Gainsight when I ask about the health of a customer. It might go to the CRM for the most recent update on an opportunity when it should probably go to Slack or Teams.
You get the point: the chief weakness of this architecture is that a RAG system has no way to decide which content system is the preferred store for a given kind of content.
To fully appreciate the advancements in agentic RAG, it’s crucial to understand the foundations of retrieval augmented generation. Let’s examine the core principles and mechanics of basic RAG systems.
What is retrieval augmented generation (RAG)?
Retrieval augmented generation (RAG) is an architectural approach that enhances large language models (LLMs) by integrating external knowledge. This method allows LLMs to access and incorporate up-to-date information beyond their initial training data.
The basic RAG process consists of the following components (a minimal code sketch follows the list):
- A pre-trained language model serving as the base architecture
- An external knowledge base containing relevant, current information
- A retrieval mechanism to access pertinent data from the knowledge base
- An augmentation process that incorporates retrieved information into the model's generation pipeline
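To make these components concrete, here is a minimal sketch of the basic RAG loop in Python. The `KNOWLEDGE_BASE` list, the word-overlap retriever, and `llm_generate()` are hypothetical stand-ins; a production system would use an embedding model, a vector store, and a real LLM client.

```python
# Minimal sketch of a basic RAG loop. Everything here is a toy stand-in for
# the real components: an embedding model, a vector database, and an LLM API.

KNOWLEDGE_BASE = [
    "Acme renewal opportunity is in the negotiation stage.",
    "The Q3 pricebook lists the enterprise tier at $50 per seat.",
    "Next onsite meeting with Acme is scheduled for Friday.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval mechanism: score passages by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def llm_generate(prompt: str) -> str:
    """Stand-in for the pre-trained LLM; a real system would call a model API."""
    return f"[model response grounded in]\n{prompt}"

def answer(query: str) -> str:
    """Augmentation: place the retrieved passages in the prompt, then generate."""
    context = "\n".join(retrieve(query))
    return llm_generate(f"Context:\n{context}\n\nQuestion: {query}")

print(answer("What stage is the Acme opportunity in?"))
```

The key point is that the model only sees whatever `retrieve()` places in the prompt, which is why retrieval quality ends up determining answer quality.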
In short, RAG helps AI give more accurate and up-to-date answers. The architecture enables LLMs to generate responses that are both linguistically coherent and factually grounded in current, domain-specific knowledge.
Importantly, RAG is why various AI tools can now use your company's data to answer questions. They can look at things like chat messages, customer info, and even obscure, old reports to find the answers you're looking for. As a result of this technique, search got smarter. Chatbots became genuinely helpful. The AI world took notice, and RAG applications exploded.
But RAG isn't perfect yet. It's still new and has some problems. We're working on making it even better.
The problem with basic RAG
In short, RAG systems are having trouble with too much data. Many companies thought just adding more info would make AI smarter. But that's proving to be counterproductive.
The main issue is that RAG can't easily decide what's important in all this data. Increasing the volume of data doesn't equate to improved intelligence; it's like searching for a needle in a haystack by adding more hay. In practice, when companies connect many data sources to their RAG tool, they often get bad answers, even when good info is there.
For example: I’m a salesperson and I want to find information about an opportunity. My AI-powered copilot can look in many places — sales records, emails, calendars, chat messages, and files.
But this RAG-based search system doesn't know where to look first or which source to trust. What if I want to know about the most recent updates on an opportunity? What if I want to find out when an opportunity would close? What if I want to find out when I am meeting a customer next? It might pull information from an email instead of CRM or from Notion instead of Gainsight, leading to unpredictable and generally unreliable results.
Basic RAG often misses the best information. It might ignore expert knowledge and use less reliable sources instead. I've witnessed numerous instances where customers realize their sophisticated RAG tool is producing subpar results, despite having access to the best, most up-to-date information. This isn't merely frustrating; it can be harmful in critical business scenarios.
The fundamental issue is that RAG alone can't choose the best place to look for specific questions. It's not smart enough to understand the context of what it's looking for. Basic RAG is reaching its limits. Just adding more data isn't the answer. We need a smarter way that not only finds info but also comprehends and contextualizes it.
How agentic AI makes RAG better
Basic RAG has problems when there's too much data. Agentic AI can help fix this.
Agentic AI adds a "reasoner" to RAG. The reasoner doesn't just pull data; it understands who is asking, what they are asking, and the context around the question.
In a basic RAG system, there’s no clear plan for why certain data is retrieved. Agentic RAG creates a purpose-driven approach to retrieval.
Here's what the reasoner does:
- It infers what the user really wants based on their identity
- It makes a plan to find and use the right information
- It uses context to understand the relative importance and reliability of various data sources
For example, if someone asks about a recent sales update, the reasoner might prioritize real-time communication tools like Slack or Teams over more static sources like CRM entries. This helps find newer, more useful information.
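As a rough illustration, the planning step can be sketched as routing logic that runs before any retrieval. The source names and keyword rules below are invented for this example; a real reasoner would use an LLM and the user's context rather than hand-written rules.

```python
# Hypothetical sketch of a "reasoner" that plans which sources to query, and in
# what order, before any retrieval happens. Sources and rules are illustrative.

SOURCES = ["crm", "email", "calendar", "slack", "drive"]

def plan_sources(question: str, user_role: str) -> list[str]:
    """Return sources ordered by how much the reasoner trusts them for this question."""
    q = question.lower()
    if "recent update" in q or "latest" in q:
        # Real-time channels first for freshness, CRM as a fallback.
        return ["slack", "email", "crm"]
    if "close" in q and user_role == "sales":
        # Deal mechanics live in the system of record.
        return ["crm"]
    if "meeting" in q:
        return ["calendar", "email"]
    return SOURCES  # no strong signal: fall back to searching everything

print(plan_sources("Any recent updates on the Acme opportunity?", "sales"))
# -> ['slack', 'email', 'crm']
```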
The reasoner also checks if the information is good before showing it to the user. It tries to use the best sources and avoid unreliable ones.
This means:
- You get more current and relevant answers
- The AI is less likely to use old or wrong information
- You see better results, even when there's lots of data to search through
It’s important to note that agentic RAG can also change its plan if needed. If it doesn’t find good information in one place, it looks somewhere else. If it queries the CRM and finds no recent updates, it can pivot to other sources like call notes or project management tools. This adaptability ensures that users get the most pertinent information available.
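Here is a minimal sketch of that pivot, assuming a hypothetical `search(source, question)` helper that stands in for each system's retrieval call and returns an empty list when a source has nothing useful.

```python
# Sketch of adaptive retrieval: try the planned source first, and pivot to the
# next one whenever the current source returns nothing useful.

def search(source: str, question: str) -> list[str]:
    """Hypothetical per-source retrieval; returns [] when a source has nothing relevant."""
    fake_data = {
        "crm": [],  # imagine the CRM has no recent updates logged
        "slack": ["Rep posted yesterday: Acme legal review is done."],
    }
    return fake_data.get(source, [])

def retrieve_with_fallback(question: str, planned_sources: list[str]) -> list[str]:
    for source in planned_sources:
        results = search(source, question)
        if results:          # good information found: stop here
            return results
    return []                # nothing found anywhere; the copilot should say so

print(retrieve_with_fallback("Latest on Acme?", ["crm", "slack", "teams"]))
```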
Even more crucially, agentic RAG doesn’t just find data; it gets it ready to use. As the sketch after this list suggests, that might mean:
- Translating
- Doing math
- Verifying information
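One way to picture this preparation step is as a small pipeline of "skill" functions applied to retrieved data before it reaches the user. The helpers below are toy stand-ins; real implementations would call translation models, calculators, or verification services.

```python
# Toy sketch of post-retrieval "skills": each function takes retrieved data and
# returns it in a more usable form.

def translate(text: str, target_lang: str = "en") -> str:
    return text  # stand-in: imagine a translation model call here

def compute_totals(line_items: list[dict]) -> float:
    return sum(item["qty"] * item["unit_price"] for item in line_items)

def verify_freshness(record: dict, max_age_days: int = 30) -> bool:
    return record.get("age_days", 0) <= max_age_days

record = {"summary": "Acme plans to sign next week", "age_days": 2,
          "line_items": [{"qty": 100, "unit_price": 50.0}]}

if verify_freshness(record):
    print(translate(record["summary"]))
    print("Deal value:", compute_totals(record["line_items"]))
```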
And that’s where agentic “skills” really come into focus.
The generative nature of agentic RAG grants it the ability to problem-solve independently. Agentic RAG can sequence and chain together multiple generative actions on its own or within the systems it's retrieving information from.
Simply put, agentic RAG can take action on its own. It can even do complex tasks by breaking them into smaller steps.
Let’s return to our sales example. Imagine that same rep has progressed their deal and is working to close the opportunity they’ve been working on. A RAG solution could point them toward the documents and process descriptions for quote creation. An agentic RAG solution, however, would be capable of drafting that quote with the specifics of the user’s account.
The plan might start with creating a quote draft from the standard quoting template, then fill in the standard deal fields from CRM data and generate baseline costs from the most recent pricebook. Now imagine the rep wants to sweeten the deal with a discount: they instruct the copilot to apply a certain percentage, and the AI calculates and adjusts the quote accordingly. Notably, each of these seemingly simple tasks requires multiple steps to achieve.
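A hedged sketch of what that chained plan could look like in code. The template name, CRM fields, pricebook, and discount step are all hypothetical stand-ins for the systems an agent would actually call.

```python
# Hypothetical sketch of an agent chaining several steps to draft a quote:
# pull the template, fill CRM fields, price from the latest pricebook, then
# apply the requested discount. Each dict stands in for a real system call.

CRM = {"account": "Acme Corp", "opportunity": "Acme Renewal", "seats": 200}
PRICEBOOK = {"enterprise_seat": 50.00}  # most recent pricebook

def draft_quote(discount_pct: float = 0.0) -> dict:
    quote = {"template": "standard_quote_v3"}          # step 1: start from the template
    quote.update(CRM)                                   # step 2: fill deal fields from CRM
    base = CRM["seats"] * PRICEBOOK["enterprise_seat"]  # step 3: baseline cost from pricebook
    quote["subtotal"] = base
    quote["discount_pct"] = discount_pct                # step 4: apply the rep's discount
    quote["total"] = round(base * (1 - discount_pct / 100), 2)
    return quote

print(draft_quote(discount_pct=10))
# 200 seats x $50 = $10,000 subtotal; a 10% discount brings the total to $9,000.
```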
This is very different from basic RAG. Agentic RAG can do many useful things all at once, making it much more helpful for users.
Upleveling the Moveworks Copilot’s architecture with agentic AI
At Moveworks, our vision for our Copilot was to create a system that truly understands and anticipates user needs. In our architecture, when a user poses a question, the Copilot seamlessly breaks the process down into clear, effective steps (sketched in code after the list):
1. Understand user goals: The Copilot starts by comprehending the user's query and goals, ensuring it grasps the intent behind the request.
2. Plan function calls: Next, it strategically plans the function calls required to achieve the user's goals. This step involves mapping out which systems and data sources are best suited to provide the most relevant answers.
3. Execute relevant function calls: The Copilot then executes these function calls across the targeted systems, retrieving precise information from each source.
4. Rank and summarize results: Finally, it ranks and summarizes the results, presenting the user with a concise and prioritized overview that directly addresses their query.
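To make the flow concrete, here is a simplified, hypothetical sketch of the four stages handing off to one another. The helper names (`understand`, `plan_calls`, `execute`, `rank_and_summarize`) are invented for illustration and do not reflect the actual Moveworks implementation.

```python
# Simplified, hypothetical sketch of the four-stage flow described above.

def understand(query: str) -> dict:
    """Stage 1: infer the user's goal (a real system would use an LLM here)."""
    return {"goal": "opportunity_status", "entity": "Acme", "raw": query}

def plan_calls(goal: dict) -> list[dict]:
    """Stage 2: choose which systems to call and with what arguments."""
    return [{"system": "crm", "fn": "get_opportunity", "args": {"name": goal["entity"]}},
            {"system": "slack", "fn": "search_messages", "args": {"query": goal["entity"]}}]

def execute(calls: list[dict]) -> list[dict]:
    """Stage 3: run each planned call (stubbed results here)."""
    return [{"source": c["system"], "content": f"result from {c['system']}"} for c in calls]

def rank_and_summarize(results: list[dict], goal: dict) -> str:
    """Stage 4: order results by relevance and produce a concise answer."""
    ranked = sorted(results, key=lambda r: r["source"] != "crm")  # prefer the system of record
    return f"Answer for '{goal['raw']}': " + "; ".join(r["content"] for r in ranked)

goal = understand("Status of the Acme deal?")
print(rank_and_summarize(execute(plan_calls(goal)), goal))
```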
We’ve found that an agentic RAG approach not only enhances efficiency but also significantly boosts the quality of results delivered by our Copilot. By integrating agentic AI into our architecture, Moveworks ensures that users receive accurate, contextually aware responses tailored to their specific needs.
It’s time for a smarter approach: Agentic RAG
Agentic RAG is a big step forward from basic RAG. At Moveworks, we’ve seen firsthand how agentic RAG is smarter. Its architecture is set up to think about what information is really needed and to take the right next steps.
Our Copilot doesn’t just surface a lot of data; it finds the right data for each specific question. And this helps businesses make better decisions faster.
Agentic RAG does more than just save time; it changes how AI helps people work. As companies deal with more complex information, we will continue to improve our AI. Our agentic RAG approach shows how we are making AI that really helps people do their jobs better.
Discover how Moveworks’ agentic RAG can transform your operations, streamline workflows, and deliver precise, context-aware insights. Request a demo.