The thundering herd
A popular cache entry expires. In the next instant, a thousand requests for that same key all miss the cache and rush to the origin or database at once. This thundering herd can overwhelm the backend at exactly the moment the cache offers it no protection.
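To make the failure mode concrete, here is a minimal simulation of a cache-aside read path with no coalescing. Everything in it is an assumption for illustration: the in-process dict standing in for the cache, the `load_from_origin` helper, the key name, and the use of a barrier to model requests that arrive in the same instant.

```python
import threading
import time


def simulate_herd(n_requests, key="hot-key"):
    """Count origin calls when n_requests miss the same expired key together."""
    cache = {}                      # hypothetical cache; the hot entry has just expired
    origin_calls = 0
    counter_lock = threading.Lock()
    arrived = threading.Barrier(n_requests)

    def load_from_origin(k):
        # Stand-in for a slow database or origin fetch.
        nonlocal origin_calls
        with counter_lock:
            origin_calls += 1
        time.sleep(0.01)
        return f"value-for-{k}"

    def handle_request():
        value = cache.get(key)      # every request sees a miss
        arrived.wait()              # model all requests arriving in the same instant
        if value is None:
            value = load_from_origin(key)
            cache[key] = value

    threads = [threading.Thread(target=handle_request) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return origin_calls


if __name__ == "__main__":
    print(simulate_herd(100))       # prints 100: one origin call per request
```

Because every request checks the cache before any of them has repopulated it, the number of origin calls equals the number of requests.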
Coalescing the requests
Request coalescing, also called single-flight, ensures that for a given key only one request actually goes to the backend. The other concurrent requests for the same key wait for that single in-flight call and then all share its result.
- The first request for a key starts the real work and registers itself.
- Later requests for the same key find the in-flight entry and attach to it.
- When the work completes, every waiter gets the same answer.
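The three steps above can be sketched with threads as follows. This is one possible shape under stated assumptions, not a reference implementation: the `SingleFlight` class, its `do` method, and the `_Call` record are hypothetical names, and the sketch also assumes that a failure in the backend call should be shared with every waiter just as a result would be.

```python
import threading


class _Call:
    """One in-flight backend call and its eventual outcome."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}        # key -> _Call for work currently running

    def do(self, key, fn):
        """Run fn() at most once per key at a time; concurrent callers share the outcome."""
        with self._lock:
            call = self._in_flight.get(key)
            is_leader = call is None
            if is_leader:
                # First request for this key: register the in-flight entry.
                call = _Call()
                self._in_flight[key] = call

        if not is_leader:
            # Later request: attach to the in-flight call and wait for it.
            call.done.wait()
            if call.error is not None:
                raise call.error
            return call.result

        try:
            call.result = fn()      # the only real backend call for this key
            return call.result
        except BaseException as exc:
            call.error = exc        # assumption: waiters see the same failure
            raise
        finally:
            with self._lock:
                del self._in_flight[key]
            call.done.set()         # wake every waiter with the shared outcome
```

A caller would wrap the expensive read, for example `flights.do("hot-key", lambda: load_from_origin("hot-key"))`, where `load_from_origin` is whatever fetch the application already has. The entry is removed as soon as the call finishes, so a request that arrives afterwards starts a fresh call; a cache is still needed to serve reads between bursts.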
Where it helps
Coalescing helps most with expensive, identical reads, such as a hot config value or a trending page. It does little for unique requests, since there is nothing to collapse.
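The contrast can be checked with a small experiment. The sketch below uses a different mechanism, sharing one asyncio task per key, purely to keep the example short; `count_backend_calls`, the `backend` coroutine, and the key names are all hypothetical.

```python
import asyncio


async def _run(keys):
    in_flight = {}                  # key -> asyncio.Task for work currently running
    backend_calls = 0

    async def backend(key):
        # Stand-in for an expensive read.
        nonlocal backend_calls
        backend_calls += 1
        await asyncio.sleep(0.01)
        return key.upper()

    async def coalesced_get(key):
        task = in_flight.get(key)
        if task is None:
            # First request for this key starts the work and registers it.
            task = asyncio.create_task(backend(key))
            in_flight[key] = task
            task.add_done_callback(lambda _t: in_flight.pop(key, None))
        return await task           # later requests await the same task

    await asyncio.gather(*(coalesced_get(k) for k in keys))
    return backend_calls


def count_backend_calls(keys):
    """Issue one concurrent request per entry in keys; return how many hit the backend."""
    return asyncio.run(_run(keys))


if __name__ == "__main__":
    print(count_backend_calls(["trending-page"] * 1000))            # prints 1
    print(count_backend_calls([f"user-{i}" for i in range(1000)]))  # prints 1000
```

A thousand requests for one key collapse into a single backend call, while a thousand distinct keys still cost a thousand calls.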
Key idea
Request coalescing collapses many concurrent requests for the same key into a single backend call that all waiters share, preventing a thundering herd on cache misses.