Raw questions are noisy
Users rarely phrase questions the way documents are written. A query may carry chit-chat, pronouns that refer to earlier turns, or several intents at once. Query rewriting uses a language model to transform that raw input into a clean, self-contained query that retrieves more relevant passages.
What rewriting does
- Resolve references so that "it" or "that" becomes the actual entity from the conversation.
- Strip filler so only the searchable intent remains.
- Add context from the chat history that the embedding needs to find the right passage.
In a chat setting, the latest message alone is often meaningless without earlier turns, so rewriting folds that history into one standalone question.
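The three steps above can be sketched as a single prompt sent to a language model. This is a minimal illustration, not a reference implementation: the prompt wording, the (role, text) history format, and the `llm` callable (any function that takes a prompt string and returns text) are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# (role, text) pairs, oldest first. This shape is an assumption for the sketch.
Turn = Tuple[str, str]

# Hypothetical instruction text covering the three jobs: resolve references,
# strip filler, and add context from the history.
REWRITE_INSTRUCTIONS = (
    "Rewrite the user's latest message as one standalone search query. "
    "Replace pronouns with the entities they refer to, drop greetings and "
    "filler, and include any detail from the conversation that the query "
    "needs. Return only the rewritten query."
)


def build_rewrite_prompt(history: List[Turn], message: str) -> str:
    """Fold the chat history and the latest message into one rewriting prompt."""
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        f"{REWRITE_INSTRUCTIONS}\n\n"
        f"Conversation so far:\n{transcript}\n\n"
        f"Latest message: {message}\n"
        f"Standalone query:"
    )


def rewrite_query(history: List[Turn], message: str, llm: Callable[[str], str]) -> str:
    """Ask the model for a standalone query; keep the raw message if it returns nothing."""
    rewritten = llm(build_rewrite_prompt(history, message)).strip()
    return rewritten or message
```

With a history about a hypothetical "X200 router" and the latest message "ok cool, how do I reset it?", a capable model would be expected to return something like "How do I reset the X200 router?". Falling back to the raw message when the model returns an empty string is a design choice for the sketch, so that retrieval always receives some query.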
Where it sits
Rewriting runs between the user and the retriever. The cleaned query feeds search, while the original message can still guide the final answer's tone.
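One way to picture this placement is a small pipeline function. The sketch below assumes three hypothetical callables, `rewrite`, `retrieve`, and `generate`, standing in for whatever rewriter, search backend, and answer model a system actually uses; none of these names come from a specific library.

```python
from typing import Callable, List, Sequence, Tuple

# (role, text) pairs, oldest first. This shape is an assumption for the sketch.
Turn = Tuple[str, str]


def answer_turn(
    history: Sequence[Turn],
    message: str,
    rewrite: Callable[[Sequence[Turn], str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Run one chat turn: rewrite, then retrieve, then answer."""
    # Rewriting sits between the user and the retriever.
    search_query = rewrite(history, message)
    # Only the cleaned query is used for search.
    passages = retrieve(search_query)
    # The original message, not the rewrite, goes to the generator so the
    # reply can follow the user's own wording and tone.
    return generate(message, passages)
```

Keeping the two strings separate is the point of the design: the retriever never sees the filler, and the generator never loses it.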
Why it matters
Retrieval quality is capped by query quality. A vague or context-free query cannot match the right chunk no matter how good the index is. Rewriting fixes the input before any embedding happens, which is cheaper and more reliable than patching results afterward.
Key idea
Query rewriting turns a noisy, context-dependent user message into a clean, standalone query, lifting retrieval quality before any embedding or search occurs.