title: “RAG”

RAG

Retrieval-Augmented Generation (RAG) is a framework used to optimize the output of a Large Language Model (LLM) by retrieving relevant, authoritative information from an external knowledge base to augment the model’s context window.
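
The retrieve-then-augment loop described above can be sketched in a few lines. This is a toy illustration, not a reference implementation: it assumes bag-of-words cosine similarity in place of a real embedding model, and the names `retrieve`, `build_prompt`, and `CHUNKS` are hypothetical. The resulting prompt would then be passed to whichever LLM is in use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    # Rank chunks by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    # Augment the model's context with the retrieved chunks.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical knowledge base, invented for illustration only.
CHUNKS = [
    "The wiki is rebuilt nightly from the notes folder.",
    "Embeddings are stored in a local vector index.",
    "Lunch is served at noon in the cafeteria.",
]

if __name__ == "__main__":
    print(build_prompt("When is the wiki rebuilt?", CHUNKS, k=1))
```

In practice the similarity function is usually replaced by dense embeddings and a vector index; the overall shape (score, select, prepend to the prompt) stays the same.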

Traditional RAG: Relies on the retrieval of discrete, often static, document chunks to ground model responses in external data.
LLM Wiki pattern:
- Employs an LLM to autonomously maintain and evolve a structured wiki.
- Focuses on a self-sustaining, continuously updating knowledge architecture rather than reactive retrieval of isolated snippets.
- Reference: 2026 04 10 Karpathys [[concepts/llm-wiki|L
Persistent Memory Augmentation:
- Addresses limitations of traditional RAG by providing AI agents with a persistent, searchable “second brain” for long-term memory retention.
- Example: Gbrain: Open-Source Second Brain for AI Agent Persistent Memory integrates with agents like Hermes Agent to maintain state across sessions.
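
To make "state across sessions" concrete, here is a minimal sketch of a persistent, searchable memory store. It is not Gbrain's or Hermes Agent's actual API: the `MemoryStore` class and its methods are hypothetical, and it assumes a local SQLite file with plain substring search rather than semantic search.

```python
import os
import sqlite3
import tempfile


class MemoryStore:
    """Hypothetical on-disk memory that survives process restarts."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        self.conn.commit()

    def remember(self, text):
        # Persist one memory; commit so it is written to disk immediately.
        self.conn.execute("INSERT INTO memories (text) VALUES (?)", (text,))
        self.conn.commit()

    def search(self, term, limit=5):
        # Substring match, newest first (an assumption; real systems
        # typically use embeddings or full-text search).
        rows = self.conn.execute(
            "SELECT text FROM memories WHERE text LIKE ? "
            "ORDER BY id DESC LIMIT ?",
            (f"%{term}%", limit),
        )
        return [row[0] for row in rows]

    def close(self):
        self.conn.close()


def demo_two_sessions():
    # Simulate two agent sessions sharing one database file.
    path = os.path.join(tempfile.mkdtemp(), "memory.db")

    first = MemoryStore(path)
    first.remember("User prefers concise answers.")
    first.close()

    second = MemoryStore(path)  # a later session reopens the same file
    hits = second.search("concise")
    second.close()
    return hits
```

The second session sees what the first one stored because the data lives in the file rather than in the model's context window, which is the gap this pattern is meant to close.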

NemoClaw Knowledge Wiki