Requirements
- Given a prefix, return the top ranked completions quickly.
- Suggestions are ranked by popularity and updated over time.
- Latency must be a few milliseconds per keystroke.
High level design
Precompute the top suggestions for each prefix and serve from memory.
- Trie: a prefix tree where each node stores the top k completions for that prefix, computed offline.
- Ranking: a batch job aggregates query frequencies and writes the top k per node.
- Serving: an in memory trie or a key value store keyed by prefix returns suggestions instantly.
Bottlenecks
- Per keystroke load: every character is a request, so responses come from memory and are debounced on the client.
- Freshness: rankings update in batches rather than per query to keep serving cheap.
- Storage: caching top k per prefix bounds memory rather than storing all completions.
Tradeoffs
- Precomputing top k per prefix gives instant reads but delays freshness.
- A full trie is memory heavy, so popular prefixes can be cached and rare ones computed on demand.
Key idea
Typeahead precomputes the top k completions per prefix offline and serves them from an in memory trie so each keystroke is a millisecond lookup.