AI Glossary

Retrieval-Augmented Generation

Enhancing LLM outputs by retrieving relevant external documents and including them as context.

Overview

Retrieval-Augmented Generation (RAG) combines large language models with external knowledge retrieval to produce more accurate, up-to-date, and verifiable responses. When a query arrives, relevant documents are retrieved from a knowledge base (using vector search or keyword search), then included in the LLM's context for generation.
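The retrieve-then-generate flow above can be sketched in a few lines. This is a toy illustration, not a production implementation: it uses bag-of-words term vectors and cosine similarity in place of a learned embedding model and a real vector index, and `build_prompt` stands in for the final LLM call.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words term-frequency vector.
    A real RAG system would use a learned embedding model instead."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Rank documents by similarity to the query; return the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    """Augment the query with retrieved context before calling the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            f"Answer using only the context above.")

docs = [
    "RAG retrieves documents and adds them to the LLM context.",
    "Vector databases store embeddings for similarity search.",
    "Transformers use self-attention over token sequences.",
]
prompt = build_prompt("How does RAG use retrieved documents?", docs)
```

The prompt produced here would then be sent to the LLM, which generates an answer grounded in the retrieved passages rather than in its parametric memory alone.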

Key Details

RAG addresses key LLM limitations: knowledge cutoff dates, hallucination, and inability to access private data. A typical pipeline involves document chunking, embedding, indexing, retrieval, and augmented generation. Advanced techniques include hybrid search (combining semantic and keyword retrieval), reranking, query decomposition, and iterative retrieval. RAG is among the most widely deployed patterns for enterprise LLM applications.
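One common way to implement the hybrid search mentioned above is reciprocal rank fusion (RRF), which merges the ranked results of a keyword index and a vector index without needing to normalize their incompatible scores. The sketch below assumes two hypothetical result lists (`keyword_hits`, `semantic_hits`); the document IDs are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists into one.
    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked highly by any retriever rise to the top.
    k=60 is the conventional smoothing constant from the RRF literature."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical top results from a keyword index and a vector index:
keyword_hits = ["doc3", "doc1", "doc7"]
semantic_hits = ["doc1", "doc5", "doc3"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Because RRF only looks at ranks, the same function also works for fusing more than two retrievers, and its output is often passed to a cross-encoder reranker as the next pipeline stage.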

Related Concepts

retrieval augmented generation, vector database, embeddings


Last updated: March 5, 2026