The hot key problem
A hot key is a single item so popular that requests for it overwhelm the one shard or cache node that owns it. A viral post or a celebrity profile can concentrate millions of reads onto one key while the rest of the system idles.
When the cache misses
The danger peaks at a cache miss. The moment a hot key expires, thousands of concurrent requests all see the miss and, because each one independently tries to repopulate the cache, all rush to the database at once. This is a cache stampede confined to a single key.
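A small simulation can make the stampede concrete. The sketch below is illustrative only: `NaiveCache`, `simulate`, the key `post:42`, and the 0.2 second database latency are all hypothetical, and the cache is a plain in-process dictionary rather than a real cache node. Every request checks the cache on its own and, on a miss, goes to the database on its own.

```python
import threading
import time


class NaiveCache:
    """Cache-aside reads with no coordination between concurrent requests."""

    def __init__(self):
        self.data = {}      # empty: models the instant after the hot key expired
        self.db_calls = 0   # how many requests reached the "database"
        self._count_lock = threading.Lock()

    def _slow_db_read(self, key):
        with self._count_lock:
            self.db_calls += 1
        time.sleep(0.2)     # simulated database latency (assumed value)
        return f"row-for-{key}"

    def get(self, key):
        value = self.data.get(key)
        if value is None:   # every concurrent request sees this same miss
            value = self._slow_db_read(key)
            self.data[key] = value
        return value


def simulate(n_requests=20):
    cache = NaiveCache()
    start = threading.Barrier(n_requests)  # release all requests together

    def worker():
        start.wait()
        cache.get("post:42")

    threads = [threading.Thread(target=worker) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.db_calls


if __name__ == "__main__":
    print(f"database reads for one key: {simulate()}")
```

Because the requests arrive well within the time one database read takes, the count is normally equal to the number of requests, not one.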
Request coalescing
Request coalescing, also called single-flight, addresses this. When many requests want the same missing key, only the first goes to the backend. The others wait for that fetch to finish and share its result.
- One in-flight fetch per key.
- All concurrent waiters receive the same answer.
- The backend sees one request instead of thousands.
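The three properties above can be sketched with a lock, a dictionary of in-flight calls, and an event per call. This is a minimal thread-based illustration, not a reference implementation: `SingleFlight`, `_Call`, `demo`, the key `post:42`, and the simulated 0.2 second backend latency are all hypothetical names and values chosen for the example.

```python
import threading
import time


class _Call:
    """State shared by the leader and the waiters for one in-flight fetch."""

    def __init__(self):
        self.done = threading.Event()
        self.value = None
        self.error = None


class SingleFlight:
    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call for the fetch currently running

    def do(self, key, fetch):
        with self._lock:
            call = self._in_flight.get(key)
            is_leader = call is None
            if is_leader:  # first request for this key starts the fetch
                call = _Call()
                self._in_flight[key] = call

        if is_leader:
            try:
                call.value = fetch(key)  # the only backend request for this key
            except Exception as exc:
                call.error = exc         # waiters will see the same failure
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()          # wake every waiter
        else:
            call.done.wait()             # block until the leader finishes

        if call.error is not None:
            raise call.error
        return call.value


def demo(n_requests=20):
    flight = SingleFlight()
    backend_calls = []  # one entry appended per real backend fetch

    def slow_backend(key):
        backend_calls.append(key)
        time.sleep(0.2)  # simulated database latency (assumed value)
        return f"row-for-{key}"

    start = threading.Barrier(n_requests)  # release all requests together
    results = [None] * n_requests

    def worker(i):
        start.wait()
        results[i] = flight.do("post:42", slow_backend)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(backend_calls)


if __name__ == "__main__":
    results, calls = demo()
    print(f"{len(results)} requests, {calls} backend fetch(es)")
```

The sketch coalesces only requests that overlap in time. A request that arrives after the fetch has finished starts a new one, so in practice the leader would also write the result back to the cache before waking the waiters.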
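Combine coalescing with complementary techniques to spread and reduce the load further: replicating the hot key across nodes spreads its reads over several machines, and serving slightly stale values means fewer requests have to wait on the backend.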
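As one illustration of the replication idea, a hot key can be stored under several suffixed copies so that reads are spread over whichever nodes those copies hash to. The sketch below covers replication only, not stale serving, and rests on stated assumptions: the node names, the replication factor of 4, the `#i` suffix scheme, and the modulo placement in `node_for` are all hypothetical, and the "cluster" is a set of in-process dictionaries.

```python
import hashlib
import random

NODES = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical node names
REPLICAS = 4  # hypothetical replication factor for hot keys


def node_for(key):
    """Map a key to a node with a stable hash (simple modulo placement)."""
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def replica_keys(key, replicas=REPLICAS):
    """Derive the suffixed copies of a hot key, e.g. 'post:42#0'."""
    return [f"{key}#{i}" for i in range(replicas)]


class Cluster:
    def __init__(self):
        self.stores = {node: {} for node in NODES}
        self.reads = {node: 0 for node in NODES}  # per-node read counter

    def write_hot(self, key, value):
        # Writes fan out to every copy so that any copy can serve a read.
        for rk in replica_keys(key):
            self.stores[node_for(rk)][rk] = value

    def read_hot(self, key, rng=random):
        # Each read picks one copy at random, spreading load across copies.
        rk = rng.choice(replica_keys(key))
        node = node_for(rk)
        self.reads[node] += 1
        return self.stores[node].get(rk)


if __name__ == "__main__":
    cluster = Cluster()
    cluster.write_hot("post:42", "cached body")
    for _ in range(1000):
        cluster.read_hot("post:42")
    print(cluster.reads)
```

Two copies can hash to the same node, so the spread is usually but not always across distinct machines. The cost of this design is that every write or invalidation must touch all copies.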
Key idea
Request coalescing collapses many simultaneous requests for one hot key into a single backend fetch whose result is shared.