Source video: https://youtu.be/qreMmsOY86A
What is OpenRAG? An Overview of Agentic RAG Systems
Speaker: David Jones-Gilardi, Developer Relations Engineer, IBM
Introduction: The State of Generative AI Context Windows
As Generative AI (GenAI) models have matured, their context windows have become massive. This has led to discussions in the tech community suggesting that RAG (Retrieval-Augmented Generation) might no longer be necessary. However, even with virtually infinite context windows, RAG remains extremely relevant for three main reasons:
- Cost: If you use an internet-based model provider, you pay by the token. Injecting massive amounts of data into a prompt for every task is highly expensive.
- Performance (Latency): Processing a massive context window takes significantly more time.
- Accuracy: While LLMs are improving, a model given exactly the targeted information it needs is still far more accurate than one left to sift through an enormous, mostly irrelevant context.
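The cost argument is easy to see with some back-of-the-envelope arithmetic. The per-token price below is a made-up illustrative number, not any real provider's rate:

```python
# Back-of-the-envelope cost comparison. The $3-per-million-input-tokens
# price is an assumed, illustrative rate -- not a real price list.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000

def prompt_cost(tokens: int) -> float:
    """Dollar cost of the input tokens sent with one request."""
    return tokens * PRICE_PER_INPUT_TOKEN

# Stuffing an entire 500k-token corpus into every prompt...
full_context = prompt_cost(500_000)
# ...versus retrieving only the ~2k tokens that actually matter (RAG).
rag_context = prompt_cost(2_000)

print(f"full context: ${full_context:.2f} per request")   # $1.50
print(f"RAG context:  ${rag_context:.4f} per request")    # $0.0060
print(f"ratio: {full_context / rag_context:.0f}x")        # 250x
```

At these assumed rates, every single request pays a 250x premium for the brute-force approach, before even considering the latency of processing the larger prompt.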
What is RAG?
Retrieval-Augmented Generation (RAG) is a method used to inject external information into a model at runtime—specifically, information the model wasn’t trained on. This is especially useful for:
- Domain-specific knowledge
- Protected/Private information (data that cannot be scraped by public models)
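The core mechanic can be sketched in a few lines. This toy example uses naive keyword-overlap scoring and stops at prompt assembly; a real system would use embeddings for retrieval and send the prompt to an LLM. All names and documents here are invented for illustration:

```python
# Minimal RAG sketch: retrieve relevant text, then inject it into the
# prompt at runtime. Toy keyword scoring stands in for real vector search.
KNOWLEDGE_BASE = [
    "OpenRAG combines Docling, OpenSearch, and Langflow.",
    "Docling extracts tables and images from PDFs for LLM consumption.",
    "OpenSearch stores documents as vectors for fast hybrid search.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many query words they contain (toy scoring)."""
    words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    """Inject retrieved context into the prompt -- the essence of RAG."""
    context = "\n".join(retrieve(query, KNOWLEDGE_BASE))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What does Docling extract from documents?")
```

The model never needs to have been trained on the knowledge base; it only has to read the snippet that retrieval placed in front of it.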
Enter OpenRAG
To build an effective agentic RAG system from scratch, you need three core components:
- Quality data ingestion
- Excellent hybrid search (for fast retrieval)
- An orchestration layer to tie it all together
OpenRAG is an open-source platform of tightly integrated tools that makes standing up an effective, pre-configured agentic RAG system straightforward. It is built on top of three major platforms:
1. Docling (Data Ingestion)
Docling handles intelligent document ingestion. When you ingest a complex document like a PDF, it typically contains tables, images, and various text layouts. Docling identifies these components and extracts them in a way that is optimized for LLMs and agents. Without this, you risk feeding “junk data” to your models, drastically reducing the accuracy and efficacy of your RAG agents.
2. OpenSearch (Hybrid Search & Retrieval)
OpenSearch is a leading open-source search platform. Once documents are processed by Docling, they are sent to OpenSearch and stored as vector representations. This optimizes the data for incredibly fast search retrieval.
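"Hybrid" here means blending lexical (keyword) relevance with vector similarity. The following is a self-contained toy of that blending idea, not OpenSearch's actual scoring; the hand-made 3-dimensional "embeddings" and the equal weighting are assumptions for illustration:

```python
import math

# Toy hybrid search: blend a lexical score with vector similarity,
# analogous to how hybrid search combines BM25 and k-NN results.
DOCS = {
    "docling guide":    ([0.9, 0.1, 0.0], "how docling parses pdf tables"),
    "search internals": ([0.1, 0.9, 0.0], "how opensearch ranks vector hits"),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def keyword_score(query, text):
    words = set(query.split())
    return len(words & set(text.split())) / len(words)

def hybrid_rank(query, query_vec, alpha=0.5):
    """Score each doc as alpha * lexical + (1 - alpha) * vector similarity."""
    return sorted(
        DOCS,
        key=lambda name: alpha * keyword_score(query, DOCS[name][1])
                         + (1 - alpha) * cosine(query_vec, DOCS[name][0]),
        reverse=True,
    )

best = hybrid_rank("docling pdf tables", [0.8, 0.2, 0.0])[0]
```

Blending both signals lets exact-term matches and semantically similar passages reinforce each other, which is why hybrid retrieval tends to beat either approach alone.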
3. Langflow (Orchestration)
Langflow provides the agentic and RAG foundation. It is the workflow engine that wires up and executes the AI flows, providing connectivity to dozens of model and vector-store providers. Everything in OpenRAG is built upon Langflow.
Using OpenRAG
Once OpenRAG is installed, you can immediately begin interacting with your data through a simple user interface.
- Ingestion: You can ingest a wide range of document types into your OpenRAG knowledge base. You can also have the agent ingest URLs on the fly based on your conversations. All data is automatically processed with Docling and OpenSearch.
- Querying: Once the knowledge is in the system, you can ask questions in the chat interface. You can search across your entire corpus of knowledge or use filters to target specific document groups.
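Under the hood, filtering to a specific document group maps naturally onto OpenSearch's query DSL. The sketch below shows roughly what a filtered hybrid query body could look like; the field names, group value, and model ID are placeholders, and hybrid queries additionally require a search pipeline with a normalization processor to be configured, so treat this as illustrative rather than exact:

```json
{
  "query": {
    "hybrid": {
      "queries": [
        { "match": { "text": "quarterly revenue targets" } },
        {
          "neural": {
            "embedding": {
              "query_text": "quarterly revenue targets",
              "model_id": "<embedding-model-id>",
              "k": 10
            }
          }
        }
      ]
    }
  },
  "post_filter": { "term": { "document_group": "finance-reports" } }
}
```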
Customizing Your RAG System with Langflow
If you want to modify your RAG system—such as changing the model provider, swapping embedding models, or altering how OpenRAG manages data—you can do this directly through the Langflow Studio UI.
Example: Adding an External Data Source
If you want an agent to utilize both your OpenSearch vector database and a brand-new external data source, you can map this out visually in Langflow:
- Map the Flow: You define the inputs and outputs, placing an Agent in the center.
- Assign Tools: You connect the OpenSearch component to the Agent as a tool. Then, you bring in your external data source and connect it as a second tool.
- Define Tool Descriptions: This step is crucial: the name and description of each tool must be unique and highly descriptive, because this metadata is exactly what the Agent reads to decide which data source to pull from when a user asks a question.
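The routing decision those descriptions drive can be sketched as follows. In reality the LLM itself reads the descriptions and picks a tool; this toy approximates that with word overlap, and both tool names and descriptions are invented:

```python
# Toy sketch of description-based tool routing: the agent picks the tool
# whose description best matches the question. A real agent lets the LLM
# read these descriptions; simple word overlap stands in for that here.
TOOLS = {
    "opensearch_kb": "Search the internal OpenSearch knowledge base of ingested company documents.",
    "weather_api":   "Fetch the current weather forecast for a given city.",  # hypothetical external source
}

def pick_tool(question: str) -> str:
    q = set(question.lower().split())
    return max(TOOLS, key=lambda t: len(q & set(TOOLS[t].lower().split())))

chosen = pick_tool("what is the weather forecast for paris")
```

Vague or overlapping descriptions would make the two tools indistinguishable here, which is precisely why OpenRAG stresses keeping this metadata unique and specific.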
Once you modify the workflow in Langflow, your OpenRAG chat interface will immediately be able to query both the internal OpenSearch data and the new external data source.
Custom Applications
OpenRAG allows for deep customization. You can:
- Modify the Langflow workflow to change how data is processed before it hits OpenSearch.
- Use the OpenRAG UI as a reference to build your own custom applications.
- Directly utilize the Langflow API to build entirely bespoke applications on top of your configured RAG logic.
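A bespoke application would typically hit Langflow's run endpoint over HTTP. The sketch below only constructs the request, following the `/api/v1/run/<flow_id>` endpoint shape of Langflow's run API; the host, flow ID, and API key are placeholders, and the actual call is left commented out since it needs a running server:

```python
import json
from urllib import request

# Sketch of invoking a Langflow flow via its REST API.
LANGFLOW_URL = "http://localhost:7860"   # assumed local Langflow install
FLOW_ID = "my-rag-flow"                  # hypothetical flow ID

payload = {
    "input_value": "What does Docling do?",  # the user's question
    "input_type": "chat",
    "output_type": "chat",
}
req = request.Request(
    f"{LANGFLOW_URL}/api/v1/run/{FLOW_ID}",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-api-key": "<your-key>"},
    method="POST",
)
# response = request.urlopen(req)  # uncomment against a running Langflow server
```

Because the whole RAG pipeline lives behind that one endpoint, the calling application stays thin: it sends a question and renders the answer, while retrieval and orchestration remain in the configured flow.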
Conclusion
OpenRAG is designed so developers can stand up a complete, effective RAG platform in minutes rather than starting from scratch. Because it is fully open-source, it affords complete control and flexibility to manipulate every part of the data pipeline to suit specific enterprise needs.
Related Concepts
- RAG (Retrieval-Augmented Generation) — Wikipedia
- Generative AI — Wikipedia
- Context Windows — Wikipedia