Grounding with outside knowledge
Retrieval-augmented generation (RAG) pairs a model with a search step. Before answering, the system retrieves relevant documents from a knowledge base and places them in the prompt as context. The model then answers from that fresh, specific material rather than from memory alone.
The basic flow
- Embed and index your documents so they can be searched by meaning.
- Retrieve the passages most relevant to the user query.
- Augment the prompt with those passages.
- Generate an answer grounded in the supplied text.
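The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the sample documents are invented, and the names `build_index`, `retrieve`, and `augment` are hypothetical. The generate step is left out; the resulting `prompt` string is what would be sent to whatever model you use.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy embedding (assumption): word counts stand in for a learned vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(docs):
    # Step 1: embed and index each document.
    return [(doc, embed(doc)) for doc in docs]

def retrieve(index, query, k=2):
    # Step 2: rank documents by similarity to the query, keep the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def augment(query, passages):
    # Step 3: place the retrieved passages in the prompt as context.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Invented sample knowledge base, for illustration only.
docs = [
    "The refund window is 30 days from the delivery date.",
    "Support is available by email on weekdays.",
    "Shipping to Canada takes five to seven business days.",
]
index = build_index(docs)
query = "How many days is the refund window?"
passages = retrieve(index, query, k=1)
prompt = augment(query, passages)  # Step 4 would send this to the model.
```

In a real system the toy `embed` would be replaced by an embedding model and the list-based index by a vector store, but the shape of the flow stays the same.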
Why teams use it
- It supplies current or private information the model never trained on.
- It reduces hallucination by giving the model source text to draw on and cite.
- Updating the knowledge base is cheaper than retraining the model.
Pitfalls to manage
Retrieval quality caps answer quality, so weak search yields weak answers. Irrelevant passages waste context and can mislead the model. Always instruct the model to rely on the provided context and to say so when the answer is not found there.
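One way to act on both pitfalls is to drop low-scoring passages before they reach the prompt and to state the grounding rule explicitly. The sketch below assumes the retriever returns `(text, score)` pairs; the function name `build_grounded_prompt`, the `min_score` cutoff of 0.3, and the sample scores are all illustrative assumptions, and a sensible threshold depends on the retriever in use.

```python
def build_grounded_prompt(query, scored_passages, min_score=0.3):
    # Keep only passages whose relevance score clears the (assumed) cutoff.
    kept = [text for text, score in scored_passages if score >= min_score]
    context = "\n".join(f"[{i}] {t}" for i, t in enumerate(kept, start=1))
    if not context:
        context = "(no relevant passages found)"
    # Tell the model to stay within the context and to admit gaps.
    instructions = (
        "Answer using only the context below. "
        "If the answer is not in the context, say that it was not found."
    )
    return f"{instructions}\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

# Invented passages and scores, for illustration only.
scored = [
    ("The refund window is 30 days from the delivery date.", 0.65),
    ("Support is available by email on weekdays.", 0.14),
]
prompt = build_grounded_prompt("How long is the refund window?", scored)
empty_prompt = build_grounded_prompt("How long is the refund window?", [])
```

Here the low-scoring support passage is filtered out, and when nothing clears the cutoff the prompt says so rather than passing along weak context.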
Key idea
Retrieval-augmented generation fetches relevant documents and feeds them into the prompt so the model answers from fresh, grounded material, with retrieval quality bounding the result.