The query mismatch
A short question and the document that answers it often look very different in wording. A question is terse and interrogative, while the answer is declarative and detailed. As a result, the embedding of the raw question can land far from the passages that actually hold the answer. Hypothetical document embeddings, or HyDE, are a way to narrow this gap.
How HyDE works
Instead of embedding the question, HyDE first asks a language model to write a hypothetical answer to it, then embeds that invented answer and searches the index with the resulting vector.
- The model drafts a plausible passage as if it knew the answer.
- That draft is embedded, even though it may contain wrong facts.
- The embedding is used to find real passages that look like a good answer.
The invented details do not need to be true. They only need to make the embedding resemble genuine answer text.
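The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `fake_llm` is a hypothetical stub in place of a real language model call, and `hyde_search` and the sample passages are names and data made up for this example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (assumption: stands in for a real embedding model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(
    question: str,
    passages: List[str],
    draft_answer: Callable[[str], str],
    k: int = 1,
) -> List[str]:
    # 1. Ask the model to draft a plausible answer passage.
    draft = draft_answer(question)
    # 2. Embed the draft instead of the question.
    query_vec = embed(draft)
    # 3. Rank the real passages by similarity to the draft's vector.
    ranked = sorted(passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True)
    return ranked[:k]


def fake_llm(question: str) -> str:
    # Hypothetical stub for a language model call.
    # The draft deliberately contains a wrong number (90 instead of 100).
    return "Water boils at 90 degrees Celsius at sea level under normal atmospheric pressure."


passages = [
    "At sea level, water boils at 100 degrees Celsius under standard atmospheric pressure.",
    "Why do people ask questions about cooking pasta?",
    "The boiling scene in the film was shot in one take.",
]

top = hyde_search("What temperature does water boil at?", passages, fake_llm)
print(top[0])
```

Even though the draft states the wrong temperature, it still retrieves the real passage with the correct figure, because the rest of its wording resembles genuine answer text. In a real system, `embed` and `fake_llm` would be replaced by calls to an actual embedding model and language model, and the sorted scan by a lookup in the existing vector index.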
Why it works
Search now compares answer-shaped text to answer-shaped text: the hypothetical answer typically sits closer in embedding space to the real passages than the raw question does. The passages retrieved are real, grounded text, and those are what the generator receives.
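The effect can be seen in a small comparison. The sketch below scores a raw question and a hypothetical answer against the same real passage. It assumes a toy bag-of-words embedding, so it only captures word overlap; real dense embeddings behave differently in detail, and the example sentences are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (assumption: stands in for a real embedding model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


question = "What temperature does water boil at?"
# Hypothetical model draft; the number is intentionally wrong.
hypothetical = "Water boils at 90 degrees Celsius at sea level under normal atmospheric pressure."
real_passage = "At sea level, water boils at 100 degrees Celsius under standard atmospheric pressure."

question_score = cosine(embed(question), embed(real_passage))
hyde_score = cosine(embed(hypothetical), embed(real_passage))

print(f"question vs passage: {question_score:.2f}")
print(f"draft vs passage:    {hyde_score:.2f}")
```

In this toy setup the draft scores well above the question, since it shares the declarative phrasing of the passage while the question shares only a couple of words with it.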
Why it matters
HyDE often improves retrieval on hard or out-of-domain queries with no change to the index, at the cost of one extra generation step per query.
Key idea
HyDE embeds a model-written hypothetical answer rather than the raw question, so search compares answer-shaped text to answer-shaped text and retrieves real grounding passages more reliably.