A traffic jam of threads
The convoy effect occurs when many threads repeatedly contend for one heavily used lock and end up moving in lockstep, like cars in a slow convoy. Even fast threads are paced by the lock handoff rather than by their own work.
How a convoy forms
- A thread holds a hot lock and is descheduled, perhaps for an interrupt or a page fault.
- All other threads block waiting for that lock and pile up in a queue.
- When the holder runs again and releases the lock, the waiting threads wake, run briefly, and then immediately queue again on their next acquire.
The system spends its time on context switches and lock handoffs, and throughput collapses even though CPUs are available. Strict first-in, first-out handoff, where the lock passes directly to the longest-waiting thread, can make threads march in a fixed repeating order.
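A rough back-of-envelope model can show why throughput collapses. The sketch below is a simplification, not a measurement: it assumes that in a convoy every acquire pays a fixed wakeup and context-switch cost on top of the hold time, and that without a convoy the lock is limited only by its hold time. The function names and the example numbers are hypothetical.

```python
def convoy_throughput(hold_time, handoff_cost):
    """Operations per time unit when every acquire waits for a handoff.

    Assumption: each pass through the lock costs the hold time plus one
    wakeup/context switch, and only one thread makes progress at a time.
    """
    return 1.0 / (hold_time + handoff_cost)


def uncontended_throughput(num_threads, work_time, hold_time):
    """Operations per time unit when threads rarely block on the lock.

    Bounded either by how fast the threads can issue operations or by
    the lock itself, whichever is smaller.
    """
    return min(num_threads / (work_time + hold_time), 1.0 / hold_time)


# Illustrative numbers only (microseconds): 1 held, 10 per handoff,
# 5 of work outside the lock, 8 threads.
in_convoy = convoy_throughput(hold_time=1.0, handoff_cost=10.0)
no_convoy = uncontended_throughput(num_threads=8, work_time=5.0, hold_time=1.0)
print(f"convoy: {in_convoy:.3f} ops/us, no convoy: {no_convoy:.3f} ops/us")
```

Under these assumed numbers the handoff cost, not the work done under the lock, dominates: the lock is busy passing itself along for most of each cycle.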
Reducing convoys
- Shrink critical sections so the lock is held briefly.
- Replace one global lock with finer-grained or sharded locks.
- Use lock-free structures or read-copy-update (RCU) for read-heavy paths.
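The first two remedies can be sketched together. The example below is a minimal illustration, assuming a hypothetical ShardedCounts class that stands in for a table guarded by one global lock. Each key maps to one of several shards with its own lock, and each critical section contains only the dictionary update. In CPython the global interpreter lock limits the parallel speedup, so read this as an illustration of the structure rather than a benchmark.

```python
import threading


class ShardedCounts:
    """Per-key counters split across independently locked shards.

    Hypothetical example class: threads touching keys in different
    shards never wait on the same lock.
    """

    def __init__(self, num_shards=16):
        # Each shard pairs a plain dict with the lock that guards it.
        self._shards = [({}, threading.Lock()) for _ in range(num_shards)]

    def _shard_for(self, key):
        # Hashing is done before any lock is taken.
        return self._shards[hash(key) % len(self._shards)]

    def add(self, key, amount=1):
        counts, lock = self._shard_for(key)
        # Short critical section: one read and one write.
        with lock:
            counts[key] = counts.get(key, 0) + amount

    def get(self, key):
        counts, lock = self._shard_for(key)
        with lock:
            return counts.get(key, 0)


def worker(counts, rounds):
    for i in range(rounds):
        counts.add(i % 10)


counts = ShardedCounts(num_shards=4)
threads = [threading.Thread(target=worker, args=(counts, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(counts.get(k) for k in range(10)))  # 8000
```

The number of shards is a tuning choice: more shards lower the chance that two threads meet on the same lock, at the cost of extra memory and more work for any operation that must visit every shard.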
Key idea
A convoy forms when a hot lock paces every thread into lockstep. Shorten critical sections and shard the lock so threads stop queueing behind one another.