The thundering herd
A thundering herd happens when many clients wake at the same instant and rush the same resource. A cache key expires, or a service restarts, and thousands of requests arrive in one spike that overwhelms the backend.
Why naive retries make it worse
If every failed client retries after exactly the same delay, the retries also arrive together. The service recovers, gets slammed, and falls over again in a self reinforcing loop.
The fixes
- Exponential backoff grows the wait after each failure, thinning the rate of retries.
- Jitter adds randomness to each delay so clients no longer line up. Full jitter picks a random wait between zero and the backoff ceiling.
- Request coalescing lets one client recompute a hot value while others wait on that single result.
A simple rule
- Backoff alone still synchronizes; jitter is what actually scatters the herd.
Key idea
A thundering herd is synchronized load, and the cure is to desynchronize it with exponential backoff plus random jitter so retries spread out instead of stacking into a new spike.