Retrieval-augmented generation (RAG) has been one of the most popular strategies for improving the accuracy and contextualization of language models. But in 2025, we're no longer just talking about "retrieving" content. We're talking about complex systems where multiple agents collaborate, divide tasks, and generate more useful responses. Welcome to the era of MA-RAG.
What was RAG and why did it work?
RAG, in its traditional form, involves improving AI responses by using an external knowledge base (such as a vector database). Instead of generating text from scratch, the model accesses relevant documents and responds with that information in mind.
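To make the retrieval step concrete, here is a minimal sketch in Python. It is not how any particular product implements RAG: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `KNOWLEDGE_BASE` list stands in for a vector database, and every name in it (`embed`, `cosine`, `retrieve`, `build_prompt`) is hypothetical and chosen for illustration only.

```python
import math
from collections import Counter

# Hypothetical stand-in for a vector database: a plain list of text snippets.
KNOWLEDGE_BASE = [
    "The refund window is 30 days from delivery.",
    "Support is available by email on weekdays.",
    "Shipping takes 3 to 5 business days.",
]


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts (an assumption; real systems use a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble the prompt a language model would receive: retrieved context plus the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("how long is the refund window", KNOWLEDGE_BASE, k=1))
```

The key idea is that the model never answers from memory alone: the prompt it sees already contains the most relevant snippets, so the answer can be traced back to a source.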
This addressed two problems: stale model knowledge (a model's training data isn't always up to date) and hallucination (made-up answers without a source). Tools like ChatGPT Enterprise and Claude Pro integrated this capability starting in 2023 with positive results.
MA-RAG: when each agent has a role
Now, the field is moving towards multi-agent architectures within the RAG system itself. The article “Multi-Agent RAG: Let Experts Collaborate for Retrieval-Augmented Generation” (2024) details how the different steps of the process can be distributed among agents:
- The "strategist" agent analyzes the question and plans the search steps.
- The "search" agent performs searches and filters the best documents.
- The "reader" agent synthesizes responses based on the documents.
- The "verifier" agent reviews and rewrites the answer to ensure consistency and coherence.
This approach improves the accuracy of the results and allows for more reliable answers, with a lower risk of hallucination.
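The four roles can be sketched as a simple pipeline in which each agent reads and updates a shared state. This is an illustrative assumption, not the architecture from the cited article: each agent here is a deterministic stub where a real system would call a language model or a search index, and all names (`State`, `CORPUS`, `strategist`, `searcher`, `reader`, `verifier`, `run`) are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical document store; a real system would query a vector database.
CORPUS = [
    "RAG retrieves documents from an external knowledge base.",
    "Multi-agent systems split a task among specialized agents.",
    "Vector databases store embeddings for similarity search.",
]

STOPWORDS = {"what", "is", "a", "an", "the", "and", "how", "do", "does", "are", "of", "for"}


@dataclass
class State:
    """Shared state passed from one agent to the next."""
    question: str
    plan: list[str] = field(default_factory=list)
    documents: list[str] = field(default_factory=list)
    draft: str = ""
    answer: str = ""


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def strategist(state: State) -> State:
    # Stub planner: split a compound question on "and" into search steps.
    parts = re.split(r"\band\b", state.question, flags=re.IGNORECASE)
    state.plan = [p.strip(" ?") for p in parts if p.strip(" ?")]
    return state


def searcher(state: State) -> State:
    # Stub search: for each step, keep the document sharing the most keywords.
    for step in state.plan:
        terms = words(step) - STOPWORDS
        best = max(CORPUS, key=lambda d: len(terms & words(d)))
        if terms & words(best) and best not in state.documents:
            state.documents.append(best)
    return state


def reader(state: State) -> State:
    # Stub synthesis: a real reader would ask a language model to write the draft.
    state.draft = " ".join(state.documents)
    return state


def verifier(state: State) -> State:
    # Stub check: accept the draft only if it is backed by at least one document.
    supported = bool(state.documents) and all(d in state.draft for d in state.documents)
    state.answer = state.draft if supported else "No supported answer found."
    return state


def run(question: str) -> State:
    state = State(question)
    for agent in (strategist, searcher, reader, verifier):
        state = agent(state)
    return state


if __name__ == "__main__":
    print(run("What is RAG and how do vector databases work?").answer)
```

Because every agent has the same signature (state in, state out), roles can be swapped, reordered, or replaced by model-backed versions without touching the rest of the pipeline.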
Why is this important?
Because this model mimics the way humans work: delegating specific tasks based on skills. Furthermore, it allows systems to scale without increasing the size of a single model.
Open source is also embracing this model: projects like LangGraph, AutoGen Studio, or CrewAI allow you to easily create multi-agent workflows with tools like LangChain.