How many boxes
Server count estimation divides the load you must serve by what one server can handle, then adds margin.
The core division
- Find the peak QPS you must serve.
- Estimate per-server QPS, the rate one machine sustains while staying within its latency limits.
- Divide peak QPS by per-server capacity and round up to get the count.
If peak is 50,000 QPS and one server handles 5,000 QPS, you need 10 servers just to carry the load, before any safety margin.
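The division can be sketched in a few lines of Python. The function name `base_server_count` is made up for illustration, and rounding up with `math.ceil` is an assumption the text implies rather than states: a fractional server still means buying a whole one.

```python
import math


def base_server_count(peak_qps: float, per_server_qps: float) -> int:
    """Servers needed for the load alone, with no safety margin.

    Hypothetical helper for illustration; rounds up because a
    fractional result still requires a whole machine.
    """
    if peak_qps < 0 or per_server_qps <= 0:
        raise ValueError("peak_qps must be >= 0 and per_server_qps must be > 0")
    return math.ceil(peak_qps / per_server_qps)


# The example from the text: 50,000 QPS peak, 5,000 QPS per server.
print(base_server_count(50_000, 5_000))  # 10
```

If peak load rises even slightly past an exact multiple, say 50,001 QPS, the rounded-up count becomes 11.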
Add the safety factors
- Redundancy: add spare machines so the system survives failures, often following the N+1 or N+2 rule.
- Headroom: keep utilization below full so latency stays stable.
As utilization approaches 100 percent, requests queue up and latency climbs sharply, so the final count is always higher than the bare division suggests.
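One way to fold both safety factors into the estimate is sketched below. The 70 percent utilization target and the single spare (N+1) are illustrative assumptions, not values from the text, and `total_server_count` is a hypothetical name. The sketch applies headroom first, by shrinking each server's usable capacity, and then adds the spares on top.

```python
import math


def total_server_count(
    peak_qps: float,
    per_server_qps: float,
    target_utilization: float = 0.7,  # assumed headroom target, not from the text
    spares: int = 1,                  # assumed N+1 redundancy
) -> int:
    """Server count with utilization headroom and redundancy added.

    Hypothetical helper for illustration only.
    """
    if per_server_qps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("per_server_qps must be > 0 and target_utilization in (0, 1]")
    # Headroom: plan for each server to run at only a fraction of its capacity.
    usable_qps = per_server_qps * target_utilization
    serving = math.ceil(peak_qps / usable_qps)
    # Redundancy: extra machines that can absorb failures.
    return serving + spares


# 50,000 QPS peak, 5,000 QPS per server, 70% target utilization, N+1.
print(total_server_count(50_000, 5_000))  # 16
```

Under these assumptions the bare count of 10 grows to 15 serving machines plus 1 spare. Setting `target_utilization=1.0` and `spares=0` recovers the bare division.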
Key idea
Server count is peak QPS divided by per-server capacity, then inflated for redundancy and utilization headroom.