The average is a trap
A system that handles its average load comfortably can still fall over at peak. Capacity is decided by the busiest moment, not the daily mean.
Where peaks come from
- Diurnal cycles: traffic concentrates in waking hours.
- Events: launches, sales, or breaking news spike demand.
- Batch jobs: work that fires on a schedule adds bursts.
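A common shorthand is the peak-to-average ratio: peak load divided by average load over the same period. If most of a day's traffic lands in a few busy hours, the peak can be several times the average. For example, if half of a day's requests arrive in 4 of its 24 hours, those hours run at 3 times the daily average rate.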
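As a rough illustration of the ratio, the sketch below computes it from hourly request counts. The function name `peak_to_average` and the sample day are made up for this example; real numbers would come from your own metrics, and finer-grained samples (per minute or per second) generally show a higher ratio than hourly ones.

```python
def peak_to_average(hourly_requests):
    """Return the busiest hour's volume divided by the mean hourly volume."""
    if not hourly_requests:
        raise ValueError("need at least one hourly sample")
    average = sum(hourly_requests) / len(hourly_requests)
    if average == 0:
        raise ValueError("no traffic recorded")
    return max(hourly_requests) / average


# Hypothetical day: 18 quiet hours at 500 requests, 6 busy hours at 2,000.
hourly = [500] * 18 + [2000] * 6
ratio = peak_to_average(hourly)
print(f"peak-to-average ratio: {ratio:.2f}")
```

With this sample day the average is 875 requests per hour, so the ratio comes out a little under 2.3.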
Designing for the peak
- Size servers and connection pools for peak QPS.
- Use autoscaling so capacity follows demand and you do not pay for the peak all day.
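The two points above can be sketched together, under stated assumptions: the fleet's maximum size is derived from peak QPS, and a simple autoscaling rule picks a replica count for the current load within that limit. The function names, the 70% utilization target, and all traffic figures are hypothetical; real autoscalers also add smoothing and cooldowns that this sketch leaves out.

```python
import math


def instances_needed(qps, per_instance_qps, target_utilization=0.7):
    """Instances required to serve `qps` while keeping each instance at or
    below `target_utilization` of its measured capacity."""
    if per_instance_qps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    return math.ceil(qps / (per_instance_qps * target_utilization))


def desired_replicas(current_qps, per_instance_qps, min_replicas, max_replicas,
                     target_utilization=0.7):
    """Autoscaling target: follow demand, clamped to [min_replicas, max_replicas]."""
    wanted = instances_needed(current_qps, per_instance_qps, target_utilization)
    return max(min_replicas, min(max_replicas, wanted))


# Assumed inputs: 1,000 QPS on average, a peak-to-average ratio of 3,
# and 200 QPS per instance as measured in a load test.
average_qps = 1000
peak_qps = average_qps * 3

# Size the upper bound for the peak, not the average.
ceiling = instances_needed(peak_qps, per_instance_qps=200)

# Off-peak, the autoscaler runs far fewer replicas than the ceiling.
overnight = desired_replicas(300, 200, min_replicas=2, max_replicas=ceiling)
print(ceiling, overnight)
```

Keeping utilization headroom below 100% is a design choice here: it leaves room for bursts that arrive faster than new instances can start.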
Provisioning only for the average invites an outage exactly when the system is most valuable: during its busiest window.
Key idea
Capacity is set by peak load: estimate it by multiplying the average by a peak-to-average ratio, then serve it with autoscaling.