← Lessons

quiz vs the machine

Platinum1750

System Design

The Thundering Herd Lock

Using a single lock so only one worker rebuilds a value while the rest wait.

5 min read · advanced · beat Platinum to climb

The thundering herd

A thundering herd is when many processes wake up and contend for the same resource at once, only for one to win and the rest to do wasted work. In caching, it appears when a missing key makes hundreds of workers all try to rebuild the same value.

The mutex lock approach

The fix is a distributed lock keyed on the value being rebuilt. The first worker to acquire the lock recomputes the value and writes it to the cache. The others fail to acquire the lock and instead wait for the result or serve a stale copy.

Pitfalls to handle

  • The lock needs a timeout so a crashed holder does not block everyone forever.
  • Waiters need a fallback, such as serving stale data, if the holder is slow.
  • The lock should be fine grained, one per key, not one global lock.

Key idea

A thundering herd lock elects one worker to rebuild a value while the rest wait, but it must use per key locks with timeouts to stay safe.

Check yourself

Answer to earn rating on the learn ladder.

1. What problem does a thundering herd lock solve?

2. Why must the lock have a timeout?