The Thundering Herd
A popular key sits in cache, absorbing heavy read traffic. The instant it expires, every concurrent request misses at once, and all of them rush the origin to recompute the same value. This cache stampede can overload a database that was perfectly fine a moment earlier.
Defenses
- Request collapsing lets the first miss acquire a lock and fetch, while other requests for the same key wait and share the single result instead of each hitting the origin.
- Probabilistic early expiry refreshes a hot key slightly before its TTL elapses, spreading recomputation over time rather than concentrating it at one instant.
- Locking with a stale fallback serves the old value to waiters while one worker recomputes, which pairs naturally with stale-while-revalidate.
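To make probabilistic early expiry concrete, here is a minimal sketch of one common formulation, in which each request rolls a random "head start" scaled by how long the value takes to recompute. The function name `should_refresh_early` and the parameters `recompute_cost`, `beta`, and `rng` are illustrative assumptions, not part of any particular cache library.

```python
import math
import random


def should_refresh_early(now, expiry, recompute_cost, beta=1.0, rng=random.random):
    """Decide whether this request should recompute the value ahead of expiry.

    now, expiry:     timestamps in seconds.
    recompute_cost:  how long the last recomputation took, in seconds (assumed known).
    beta:            tuning knob (assumption); above 1.0 refreshes earlier, below 1.0 later.
    rng:             source of uniform randoms in [0, 1), injectable for testing.
    """
    # 1.0 - rng() lies in (0, 1], so the logarithm is defined and never positive.
    # Negating it yields a non-negative random head start, usually small, rarely large.
    head_start = -recompute_cost * beta * math.log(1.0 - rng())
    return now + head_start >= expiry
```

Far from expiry almost no request volunteers to refresh; as the TTL approaches, the chance rises, so recomputation tends to happen once, slightly early, instead of all at the same instant. A caller that gets `True` recomputes and rewrites the key, while everyone else keeps reading the cached value.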
Why Collapsing Works
The expensive work is computed once per key no matter how many callers want it. Without collapsing, a thousand simultaneous misses become a thousand origin calls; with it, they become one call and nine hundred ninety-nine cheap waits.
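The once-per-key behavior can be sketched in a few lines of threaded Python. This is an illustration under stated assumptions, not a production implementation: the class name `RequestCollapser` and the `fetch` callable (standing in for the origin) are hypothetical, and the sketch ignores timeouts and storing the result back into the cache.

```python
import threading
from concurrent.futures import Future


class RequestCollapser:
    """Collapse concurrent requests for the same key into one origin call."""

    def __init__(self, fetch):
        self._fetch = fetch            # hypothetical origin call: key -> value
        self._lock = threading.Lock()  # guards the in-flight table only
        self._in_flight = {}           # key -> Future shared by all callers

    def get(self, key):
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                # First miss for this key: register a Future others can wait on.
                future = Future()
                self._in_flight[key] = future
        if leader:
            try:
                # Only the leader reaches the origin, outside the lock.
                future.set_result(self._fetch(key))
            except Exception as exc:
                # Waiters see the same failure instead of hanging.
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Leader and waiters alike return the single shared result.
        return future.result()
```

The lock is held only long enough to check or update the in-flight table, so waiters for one key never block requests for a different key. Once the leader finishes, the entry is removed, and a later miss starts a fresh fetch.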
Key idea
A stampede turns one expiry into a flood of identical origin calls, and request collapsing fixes it by computing the value once and sharing it with all waiters.