System Design

Thundering Herd and Jittered Retries

When clients retry in lockstep, they hammer a recovering service, so spread retries out with jitter.

The thundering herd

A thundering herd happens when many clients wake at the same instant and rush the same resource. A cache key expires, or a service restarts, and thousands of requests arrive in one spike that overwhelms the backend.

Why naive retries make it worse

If every failed client retries after exactly the same delay, the retries also arrive together. The service recovers, gets slammed by the synchronized retries, and falls over again in a self-reinforcing loop.
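To make the lockstep effect concrete, here is a minimal sketch. It assumes 1,000 hypothetical clients that all see the failure at the same moment and all retry after the same fixed delay. The client count, the delay, and the function name are illustrative assumptions, not measurements from a real system.

```python
from collections import Counter

def fixed_delay_retry_times(failure_times, delay):
    """Return the time (in seconds) at which each client sends its retry."""
    return [t + delay for t in failure_times]

# Assumption: 1,000 clients all see the failure at t = 0.0 seconds.
failure_times = [0.0] * 1000
retry_times = fixed_delay_retry_times(failure_times, delay=2.0)

# Count how many retries land at each instant.
arrivals = Counter(retry_times)
peak_time, peak_count = arrivals.most_common(1)[0]
print(peak_time, peak_count)  # 2.0 1000
```

The original spike is simply replayed two seconds later, at full size.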

The fixes

  • Exponential backoff grows the wait after each failure, typically doubling it up to a cap, which thins the rate of retries over time.
  • Jitter adds randomness to each delay so clients no longer line up. Full jitter picks a random wait between zero and the backoff ceiling.
  • Request coalescing lets one client recompute a hot value while others wait on that single result.
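Request coalescing can be sketched in a few lines. This is one possible approach, assuming threads inside a single process. The `Coalescer` class, the `hot-key` name, and `slow_compute` are hypothetical names made up for illustration. The first caller for a key becomes the leader and runs the computation, while concurrent callers wait for its result.

```python
import threading
import time

class Coalescer:
    """Let one caller compute a value per key while concurrent callers wait."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> (done event, result holder)

    def get(self, key, compute):
        with self._lock:
            entry = self._in_flight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._in_flight[key] = entry
        done, holder = entry
        if leader:
            try:
                holder["value"] = compute()
            except Exception as exc:
                holder["error"] = exc  # waiters see the same failure
            finally:
                with self._lock:
                    del self._in_flight[key]
                done.set()  # wake every waiter
        else:
            done.wait()
        if "error" in holder:
            raise holder["error"]
        return holder["value"]

# Usage: 20 threads ask for the same hot key at once.
calls = []
results = []
release = threading.Event()

def slow_compute():
    calls.append(1)
    release.wait()  # hold the computation open, like a slow backend call
    return "fresh value"

coalescer = Coalescer()

def worker():
    results.append(coalescer.get("hot-key", slow_compute))

threads = [threading.Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
time.sleep(0.2)  # give every thread time to join the in-flight request
release.set()
for t in threads:
    t.join()
print(len(calls), len(results))
```

When every caller arrives while the computation is still in flight, the backend sees one request instead of twenty. This sketch does not cache the result, so a caller that arrives after the leader finishes starts a new computation. A real system would usually pair coalescing with a cache.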

A simple rule

  • Backoff alone still synchronizes: clients that failed together compute the same delays and retry together, only less often. Jitter is what actually scatters the herd.
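The rule can be checked with a small sketch. The base delay, cap, client count, and function names below are assumptions chosen for illustration. Plain exponential backoff gives every client the same wait for a given attempt, while full jitter draws each wait uniformly between zero and that same ceiling.

```python
import random

def backoff_ceiling(attempt, base=0.1, cap=10.0):
    """Exponential backoff: the ceiling doubles each attempt, up to cap seconds."""
    return min(cap, base * 2 ** attempt)

def full_jitter_delay(attempt, base=0.1, cap=10.0, rng=random):
    """Full jitter: a uniform random wait between zero and the backoff ceiling."""
    return rng.uniform(0.0, backoff_ceiling(attempt, base, cap))

clients = 1000
attempt = 3  # ceiling = 0.1 * 2**3 = 0.8 seconds

# Plain backoff: collect the distinct delays chosen across all clients.
plain = {backoff_ceiling(attempt) for _ in range(clients)}

# Full jitter: same ceiling, but each client draws its own random delay.
rng = random.Random(42)  # seeded only to make the sketch repeatable
jittered = {full_jitter_delay(attempt, rng=rng) for _ in range(clients)}

print(len(plain))     # 1: every client waits exactly the same time
print(len(jittered))  # close to 1000 distinct delays spread over 0 to 0.8 s
```

With plain backoff the herd stays in one piece and just arrives later. With full jitter the same retries are smeared across the whole interval.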

Key idea

A thundering herd is synchronized load, and the cure is to desynchronize it with exponential backoff plus random jitter so retries spread out instead of stacking into a new spike.

Check yourself

1. What causes a thundering herd?

2. Why is plain exponential backoff not enough?