Why headroom exists
Capacity headroom is the gap between current load and the point where a system degrades. Headroom absorbs traffic spikes, the loss of a node, and the warm-up lag of autoscaling.
Finding the limit with load testing
Load testing drives synthetic traffic at a system to learn how it behaves under stress.
- Load test confirms the system meets its targets at expected peak.
- Stress test pushes past the peak to find the breaking point.
- Soak test runs long to expose leaks and slow degradation.
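The three test types differ mainly in the shape of the load they apply. A minimal sketch of those shapes follows, assuming a hypothetical expected peak in requests per second; the function name, the 10% step every 5 minutes, and the 80% soak rate are illustrative choices, not values from any particular tool.

```python
def load_profile(kind: str, expected_peak_rps: float, minute: int) -> float:
    """Target request rate at a given minute for each test type.

    All shapes here are illustrative assumptions.
    """
    if kind == "load":
        # Hold steady at the expected peak and check targets are met.
        return expected_peak_rps
    if kind == "stress":
        # Step 10% further above the peak every 5 minutes to find the breaking point.
        return expected_peak_rps * (1.0 + 0.1 * (minute // 5))
    if kind == "soak":
        # Hold a sustained rate for a long run (assumed here: 80% of peak).
        return expected_peak_rps * 0.8
    raise ValueError(f"unknown test kind: {kind}")
```

A soak run would call this over many hours of minutes, while a stress run stops once errors or latency show the system has broken.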
Reading the knee
As load rises, latency stays flat, then climbs sharply at the knee, where queues build. You want to run below the knee, not on it.
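One simple way to locate the knee in load test results is to flag the first load level where latency exceeds some multiple of the low-load baseline. The sketch below assumes that heuristic; the function name, the factor of 2, and the sample numbers are made up for illustration, and real results may call for a more robust method.

```python
def find_knee(samples, factor=2.0):
    """Return the first load whose latency exceeds factor x baseline, or None.

    samples: list of (load, latency) pairs from a load test.
    The baseline is the latency at the lowest load measured.
    """
    ordered = sorted(samples)          # sort by load, lowest first
    baseline = ordered[0][1]           # latency at the lowest load
    for load, latency in ordered:
        if latency > factor * baseline:
            return load
    return None                        # no knee reached in the tested range


# Hypothetical results: (requests per second, latency in ms)
samples = [(100, 20), (200, 21), (300, 22), (400, 25), (500, 60), (600, 200)]
knee = find_knee(samples)
```

With these made-up numbers the baseline is 20 ms, so the knee is reported at the first load whose latency passes 40 ms. The operating target then sits somewhere below that load.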
Setting headroom
- Spike absorption: leave room for short bursts above average.
- Failure tolerance: if losing one node of four must be survivable, run each node under 75 percent, so the three survivors can carry the full load between them.
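The failure tolerance figure comes from simple arithmetic: if the load of n nodes must fit on n - f survivors, each node can use at most (n - f) / n of its capacity beforehand. A minimal sketch, assuming identical nodes and evenly balanced load; the function name is hypothetical.

```python
def max_safe_utilization(nodes: int, failures: int = 1) -> float:
    """Highest per-node utilization at which the survivors can absorb the load.

    Assumes identical nodes and perfectly even load balancing.
    """
    if not 0 <= failures < nodes:
        raise ValueError("failures must be between 0 and nodes - 1")
    return (nodes - failures) / nodes


ceiling = max_safe_utilization(4, 1)  # four nodes, one may fail
```

For four nodes and one failure this gives 0.75. At exactly that ceiling the survivors would run at their limit after a failure, which is why the target is to stay under it.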
Key idea
Load testing reveals the knee where latency explodes, and capacity headroom is the deliberate slack you keep below it so spikes, node loss, and scaling lag do not tip the system over.