The choice under overload
When demand exceeds capacity, a service has a choice. It can try to serve everything and collapse, or it can serve some requests well and reject the rest. Graceful overload handling chooses controlled degradation over total failure.
Why trying to serve everything is worst
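Accepting more work than you can complete leads to congestion collapse: queues grow, latency climbs past client timeouts, clients retry, and the retries add load to a service that is already saturated. The server keeps spending capacity on requests whose callers have already given up, so useful throughput can drop toward zero exactly when you need it most.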
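The feedback loop above can be sketched with a toy discrete-time simulation. This is a minimal illustration under stated assumptions, not a model of any real system: the function name `simulate`, its parameters, and the numbers (12 arrivals and 10 completions per tick, a 5-tick client timeout) are all invented for the example. A stale request still consumes capacity when it is dequeued, and its client retries, which is the wasted work that drives the collapse.

```python
from collections import deque


def simulate(arrivals, capacity, timeout, ticks, max_queue=None):
    """Count requests answered before their client timed out (goodput).

    Toy model: `arrivals` and `capacity` are requests per tick, `timeout`
    is in ticks. max_queue=None means the service accepts everything.
    """
    queue = deque()  # arrival tick of each waiting request, oldest first
    good = 0
    for now in range(ticks):
        # New requests arrive; a bounded queue sheds whatever does not fit.
        for _ in range(arrivals):
            if max_queue is None or len(queue) < max_queue:
                queue.append(now)
        # The server completes up to `capacity` requests this tick.
        for _ in range(min(capacity, len(queue))):
            arrived = queue.popleft()
            if now - arrived <= timeout:
                good += 1  # answered in time
            elif max_queue is None or len(queue) < max_queue:
                # Work was wasted on a request whose client already timed
                # out; the client's retry joins the back of the queue.
                queue.append(now)
    return good


# Hypothetical numbers: demand is 20% above capacity in both runs.
collapsed = simulate(arrivals=12, capacity=10, timeout=5, ticks=200)
shedding = simulate(arrivals=12, capacity=10, timeout=5, ticks=200, max_queue=50)
print(collapsed, shedding)
```

In this toy model the bounded queue (50 slots, which is capacity times timeout) answers every admitted request in time, 2000 over 200 ticks, which is the full capacity. The unbounded queue stops producing useful answers once queueing delay exceeds the client timeout.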
The tools
- Load shedding rejects excess requests early, before spending work on them, with a clear signal (for example, HTTP 429 or 503 with a Retry-After header) so clients back off.
- Prioritization keeps critical requests and drops low-value ones first.
- Admission control caps concurrent work to a level the system can actually complete.
- Backpressure tells upstream callers to slow down.
A healthy system under overload shows a throughput line that stays flat at capacity while the excess is rejected, not a crash.
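As one way to combine the first three tools, the sketch below caps in-flight work and holds back part of that cap for critical requests, so low-value traffic is shed first. The class `AdmissionController`, its parameters, and the `Overloaded` exception are hypothetical names chosen for illustration. A real service would map a rejection to whatever signal its protocol uses.

```python
import threading
from contextlib import contextmanager


class Overloaded(Exception):
    """Raised when a request is shed; the caller should back off."""


class AdmissionController:
    """Caps concurrent work and reserves some slots for critical requests."""

    def __init__(self, max_in_flight, reserved_for_critical=0):
        if not 0 <= reserved_for_critical <= max_in_flight:
            raise ValueError("reserved_for_critical must be in [0, max_in_flight]")
        self._max = max_in_flight
        # Non-critical requests are shed once this lower limit is reached.
        self._low_limit = max_in_flight - reserved_for_critical
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_admit(self, critical=False):
        """Take a slot and return True, or return False to shed the request."""
        limit = self._max if critical else self._low_limit
        with self._lock:
            if self._in_flight >= limit:
                return False
            self._in_flight += 1
            return True

    def release(self):
        """Give back a slot taken by a successful try_admit."""
        with self._lock:
            self._in_flight -= 1

    @contextmanager
    def slot(self, critical=False):
        """Hold a slot for the duration of a with-block, or raise Overloaded."""
        if not self.try_admit(critical):
            raise Overloaded("at capacity, retry later")
        try:
            yield
        finally:
            self.release()


# Hypothetical limits: 3 concurrent requests, 1 of them kept for critical work.
controller = AdmissionController(max_in_flight=3, reserved_for_critical=1)
try:
    with controller.slot(critical=False):
        pass  # handle the request here
except Overloaded:
    pass  # reject early, e.g. with a "retry later" response
```

Rejecting at admission time keeps the decision cheap: a shed request costs one lock acquisition rather than a queue slot and a timeout.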
Key idea
Shed and prioritize excess load so an overloaded service degrades gracefully instead of collapsing.