The goal
Users mistype. Spelling correction maps a misspelled query to a likely intended one before retrieval, so a search for "resturant" still finds restaurants.
Candidate generation
- Edit distance finds words within a few insertions, deletions, or substitutions of the typo.
- A dictionary of valid terms, often built from the index and query logs, supplies candidates.
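The two ingredients above can be sketched together: compute edit distance between the typo and each dictionary term, and keep the terms that fall within a small threshold. This is a minimal illustration, not a production approach. The function names, the `max_distance` default of 2, and the tiny dictionary are assumptions made for the example, and a real system would usually avoid scanning the whole dictionary.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


def candidates(typo: str, dictionary: set[str], max_distance: int = 2) -> list[str]:
    """Return dictionary terms within max_distance edits of the typo.
    max_distance is an assumed tuning knob, not a fixed rule."""
    return sorted(w for w in dictionary if edit_distance(typo, w) <= max_distance)


# Hypothetical dictionary, standing in for terms drawn from the index and query logs.
terms = {"restaurant", "restaurants", "restart", "rest"}
print(candidates("resturant", terms))
```

Here "restaurant" is one insertion away from "resturant", so it survives the filter, while distant terms such as "rest" do not.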
Picking the best candidate
Generating candidates is not enough; you must rank them. A good scorer combines:
- Likelihood that the candidate is the intended word, from how common it is.
- Closeness to what was typed, from edit distance.
- Context from neighboring words, so the correction fits the phrase.
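One way to combine the three signals is a weighted sum of log-scaled terms. The sketch below is only an illustration of that idea: the `Candidate` fields, the weights, and the counts are hypothetical, and real systems typically tune or learn these values from query logs.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    frequency: int      # assumed: how often the word appears in the index or logs
    distance: int       # edit distance from what the user typed
    context_count: int  # assumed: how often the word follows the previous query word


def score(c: Candidate, total_words: int,
          distance_penalty: float = 2.0, context_weight: float = 1.0) -> float:
    """Higher is better. The weights are illustrative defaults, not tuned values."""
    likelihood = math.log((c.frequency + 1) / (total_words + 1))  # common words score higher
    closeness = -distance_penalty * c.distance                    # each edit costs a fixed penalty
    context = context_weight * math.log(1 + c.context_count)      # fits the neighboring word
    return likelihood + closeness + context


def best_candidate(cands: list[Candidate], total_words: int) -> Candidate:
    return max(cands, key=lambda c: score(c, total_words))


# Made-up counts for the query "italian resturant".
options = [
    Candidate("restaurant", frequency=900, distance=1, context_count=40),
    Candidate("restaurants", frequency=1200, distance=2, context_count=10),
]
print(best_candidate(options, total_words=100_000).word)
```

With these made-up numbers, "restaurants" is more frequent overall, but "restaurant" wins because it is closer to the typed word and fits better after "italian".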
Did you mean
When confidence is high, the system silently runs the corrected query. When it is moderate, it runs the original query, shows a "Did you mean" suggestion, and lets the user choose.
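The confidence rule can be sketched as two thresholds. Everything named here is an assumption for illustration: the `Decision` type, a confidence value normalized to the range 0 to 1, and the cutoffs 0.9 and 0.5. The sketch also assumes that a low-confidence correction is simply discarded, which the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    query_to_run: str          # the query actually sent to retrieval
    suggestion: Optional[str]  # text for a "Did you mean" prompt, if any


def decide(original: str, corrected: str, confidence: float,
           auto_threshold: float = 0.9, suggest_threshold: float = 0.5) -> Decision:
    """Choose between auto-correcting, suggesting, and leaving the query alone.
    Both thresholds are hypothetical and would be tuned in practice."""
    if corrected == original or confidence < suggest_threshold:
        return Decision(original, None)       # low confidence: do nothing
    if confidence >= auto_threshold:
        return Decision(corrected, None)      # high confidence: correct silently
    return Decision(original, corrected)      # moderate: let the user choose


print(decide("resturant", "restaurant", confidence=0.95))
print(decide("resturant", "restaurant", confidence=0.70))
```

Keeping the original query in the moderate case is a conservative choice: a wrong suggestion costs the user one glance, while a wrong silent correction hides the results they asked for.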
Key idea
Spelling correction generates nearby candidates, ranks them by likelihood, closeness, and context, then either auto-corrects or suggests a correction, depending on confidence.