← Lessons

quiz vs the machine

Platinum1700

System Design

Design Search Autocomplete

Suggest top completions as the user types with very low latency.

6 min read · advanced · beat Platinum to climb

Requirements

  • Return ranked suggestions on each keystroke in tens of milliseconds.
  • Rank by popularity and freshness.
  • Scale to a huge query vocabulary.

High level design

Typing fires a request per prefix, served from a precomputed structure rather than a live scan.

  • Trie: a prefix tree stores terms, and each node caches its top completions so a lookup is a single descent.
  • Aggregation pipeline: a background job counts query frequencies from logs and rebuilds the trie periodically.
  • Serving cache: the precomputed top suggestions per prefix sit in memory for instant reads.

Bottlenecks

  • Latency per keystroke: precompute the top completions at each node so serving avoids ranking at query time.
  • Update freshness: rebuilding the whole trie is costly, so update in batches and merge popular new terms quickly.
  • Scale: shard the trie by prefix across nodes when the vocabulary is too large for one machine.

Debounce on the client and cap suggestions to a handful so the response stays tiny.

Key idea

Autocomplete precomputes ranked completions into a trie so each keystroke is a fast prefix lookup, with a background pipeline keeping rankings fresh.

Check yourself

Answer to earn rating on the learn ladder.

1. Why store the top completions directly at each trie node?

2. How are popularity rankings kept fresh?