The Thread Pool Sizing Formula
A thread pool caps how many tasks run at once. Too few threads waste cores; too many waste memory and thrash the scheduler. A useful starting point comes from the classic formula popularized by Brian Goetz.
For CPU-bound work the answer is simple. Threads spend nearly all their time computing, so the ideal count is about the number of available cores, sometimes plus one so that a core still has work when a thread stalls on the occasional page fault. Adding more threads beyond that just adds context switching.
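The rule of thumb above can be written down as a minimal Python sketch. The function name `cpu_bound_pool_size` and the `spare` parameter are illustrative assumptions, not part of any library.

```python
import os


def cpu_bound_pool_size(cores=None, spare=1):
    """Return a worker count for CPU-bound work: cores plus an optional spare.

    `cpu_bound_pool_size` and `spare` are hypothetical names for illustration.
    """
    if cores is None:
        # os.cpu_count() can return None when the count is undetermined.
        cores = os.cpu_count() or 1
    if cores < 1 or spare < 0:
        raise ValueError("cores must be >= 1 and spare must be >= 0")
    return cores + spare
```

In CPython, CPU-bound tasks usually run in worker processes rather than threads because of the global interpreter lock; the counting rule is the same either way.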
For IO-bound work, threads spend much of their time waiting. The formula is:
- threads = cores × target utilization × (1 + wait time / compute time)
If a task waits nine units for every one unit it computes, the ratio is nine, so each core can usefully drive about ten threads at full utilization. A pool sized only to the core count leaves its cores mostly idle while requests queue.
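As a worked check of the formula, here is a small Python sketch. The function name `io_bound_pool_size`, its parameter names, and the choice to round up to a whole thread are assumptions made for illustration.

```python
import math
import os


def io_bound_pool_size(wait_time, compute_time, cores=None, target_utilization=1.0):
    """threads = cores * target utilization * (1 + wait time / compute time).

    All names here are hypothetical; wait_time and compute_time only need to
    share a unit, since only their ratio matters.
    """
    if cores is None:
        # os.cpu_count() can return None when the count is undetermined.
        cores = os.cpu_count() or 1
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    if wait_time < 0:
        raise ValueError("wait_time must not be negative")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    ratio = wait_time / compute_time
    # Rounding up to a whole thread is an assumption of this sketch.
    return max(1, math.ceil(cores * target_utilization * (1 + ratio)))


# The example from the text: nine units of waiting per unit of computing.
print(io_bound_pool_size(9, 1, cores=1))  # 10 threads for one core
print(io_bound_pool_size(9, 1, cores=8))  # 80 threads for eight cores
```

With a wait time of zero the result collapses to the core count, which matches the CPU-bound case.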
Practical cautions:
- Measure: Estimate the wait-to-compute ratio from real traces, not guesses.
- Bound the queue: An unbounded queue hides overload; a bounded one surfaces it.
- Separate pools: Give blocking IO and CPU work their own pools so one cannot starve the other.
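The cautions above might look like this in Python. This is a sketch under stated assumptions: `BoundedExecutor` and `wait_compute_ratio` are hypothetical helpers, the trace format (one `(wait_seconds, compute_seconds)` pair per task) is invented for illustration, and rejecting work with an exception is only one of several ways to surface overload. The standard `ThreadPoolExecutor` queues without limit, so the bound is enforced here with a semaphore.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedExecutor:
    """Hypothetical wrapper: at most max_workers + queue_size tasks in flight."""

    def __init__(self, max_workers, queue_size):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._slots = threading.BoundedSemaphore(max_workers + queue_size)

    def submit(self, fn, *args, **kwargs):
        # Fail fast instead of queueing silently when every slot is taken.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("pool saturated")
        try:
            future = self._pool.submit(fn, *args, **kwargs)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the task finishes or is cancelled.
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def shutdown(self, wait=True):
        self._pool.shutdown(wait=wait)


def wait_compute_ratio(samples):
    """Aggregate wait-to-compute ratio from (wait_seconds, compute_seconds) pairs."""
    total_wait = 0.0
    total_compute = 0.0
    for wait, compute in samples:
        total_wait += wait
        total_compute += compute
    if total_compute <= 0:
        raise ValueError("traces contain no compute time")
    return total_wait / total_compute
```

Separate pools then amount to separate instances, for example one `BoundedExecutor` sized from the measured ratio for blocking IO and another sized near the core count for compute work, so a backlog in one cannot take slots from the other.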
Key idea
Size a pool near the core count for compute work, and scale it up by one plus the wait-to-compute ratio for IO work.