Concurrency

The Bulkhead Sizing

Isolating resource pools so one slow dependency cannot starve the whole service.

5 min read · core

Compartments that contain damage

A bulkhead isolates resources into separate pools so a failure in one cannot drain the others. The name comes from ship compartments that stop a single breach from flooding the whole hull.

The problem it solves

If all requests share one thread pool and one downstream dependency goes slow, every thread can end up parked waiting on that dependency. The healthy endpoints starve even though their own dependencies are fine. This is a cascading failure through a shared pool.
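The starvation described above can be reproduced in a few lines. The sketch below is illustrative only: `fast_call_completes`, `slow_dependency`, and `fast_dependency` are hypothetical names, and the slow dependency is simulated with an event that never fires until cleanup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def fast_call_completes(shared: bool, workers: int = 4) -> bool:
    """Return True if a fast call finishes while a slow dependency hangs."""
    release = threading.Event()

    def slow_dependency() -> str:
        release.wait()  # simulates a dependency that has stopped responding
        return "slow"

    def fast_dependency() -> str:
        return "fast"

    slow_pool = ThreadPoolExecutor(max_workers=workers)
    # Shared mode: both kinds of call compete for the same worker threads.
    # Bulkhead mode: the fast dependency gets a pool of its own.
    fast_pool = slow_pool if shared else ThreadPoolExecutor(max_workers=workers)
    try:
        # Enough slow calls to park every worker in slow_pool.
        for _ in range(workers):
            slow_pool.submit(slow_dependency)
        fast = fast_pool.submit(fast_dependency)
        try:
            fast.result(timeout=0.5)
            return True
        except FutureTimeout:
            return False
    finally:
        release.set()  # unblock the parked workers so the pools can shut down
        slow_pool.shutdown(wait=True)
        if fast_pool is not slow_pool:
            fast_pool.shutdown(wait=True)


if __name__ == "__main__":
    print("shared pool:", fast_call_completes(shared=True))
    print("separate pools:", fast_call_completes(shared=False))
```

With one shared pool the fast call sits in the queue behind the parked workers and times out. With a separate pool it completes, even though the slow dependency is still hung.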

Sizing the compartments

  • Give each dependency or endpoint its own bounded pool so a slow one can only exhaust its own slice.
  • Size each pool from the dependency's expected concurrency, often using Little's law (requests in flight ≈ throughput × latency) applied to that dependency's own traffic.
  • Leave a small reserve so a partial failure does not consume the entire allocation at once.
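The sizing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `pool_size`, `Bulkhead`, and `BulkheadFull` are hypothetical names, the reserve is modeled as a fractional headroom on top of the Little's law estimate, and the bulkhead caps concurrency with a semaphore rather than a dedicated thread pool.

```python
import math
import threading


def pool_size(throughput_rps: float, latency_s: float, headroom: float = 0.25) -> int:
    """Little's law estimate of in-flight requests, plus a reserve (assumed fraction)."""
    in_flight = math.ceil(throughput_rps * latency_s)  # L = throughput x latency
    reserve = math.ceil(in_flight * headroom)
    return in_flight + reserve


class BulkheadFull(Exception):
    """Raised when a bulkhead has no free slot."""


class Bulkhead:
    """Bounds concurrent calls to one dependency; rejects instead of queueing."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Non-blocking acquire: a full bulkhead fails fast rather than
        # parking the caller behind a slow dependency.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull(f"{self.name}: no free slots")
        try:
            return fn(*args, **kwargs)
        finally:
            self._slots.release()


if __name__ == "__main__":
    # Hypothetical dependency: 100 requests/s at 250 ms average latency.
    size = pool_size(throughput_rps=100, latency_s=0.25)
    payments = Bulkhead("payments", size)
    print(size, payments.call(lambda: "ok"))
```

Rejecting when full is one design choice. A bounded wait with a timeout is another, at the cost of holding the caller for longer.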

The tradeoff is that strict partitioning can leave some pools idle while another is full. A bit of slack or a shared overflow tier balances isolation against efficiency.
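One way to picture a shared overflow tier is a small common semaphore that any bulkhead may borrow from once its dedicated slots are taken. The sketch below assumes that design. `OverflowBulkhead`, `BulkheadFull`, and `slot` are hypothetical names, not an API from any particular library.

```python
import threading
from contextlib import contextmanager


class BulkheadFull(Exception):
    """Raised when neither the dedicated pool nor the overflow tier has room."""


class OverflowBulkhead:
    """Dedicated slots first, then a bounded overflow tier shared across bulkheads."""

    def __init__(self, dedicated: int, overflow: threading.BoundedSemaphore) -> None:
        self._dedicated = threading.BoundedSemaphore(dedicated)
        self._overflow = overflow

    @contextmanager
    def slot(self):
        # Try the private slice first, so the shared tier is only used under pressure.
        if self._dedicated.acquire(blocking=False):
            tier = self._dedicated
        elif self._overflow.acquire(blocking=False):
            tier = self._overflow
        else:
            raise BulkheadFull("dedicated slots and overflow tier are both full")
        try:
            yield "dedicated" if tier is self._dedicated else "overflow"
        finally:
            tier.release()  # return the slot to whichever tier granted it


if __name__ == "__main__":
    shared_overflow = threading.BoundedSemaphore(1)
    search = OverflowBulkhead(dedicated=1, overflow=shared_overflow)
    with search.slot() as first, search.slot() as second:
        print(first, second)
```

Because the overflow tier is itself bounded, a slow dependency can borrow at most that many extra slots. Every other bulkhead keeps its dedicated slice regardless.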

Key idea

Bulkheads give each dependency its own bounded pool so one slow dependency drains only its own slice, trading some idle capacity for failure isolation.

Check yourself

1. What does a bulkhead prevent?

2. What is the cost of strict bulkhead partitioning?