The Granularity Of Tasks
Granularity is the amount of work in each parallel task. Choosing it well is a balancing act between two opposing failures.
Too fine and too coarse
- Fine-grained tasks are small. They expose lots of parallelism and balance load easily, but the per-task overhead of creation, scheduling, and synchronization can swamp the useful work.
- Coarse-grained tasks are large. They amortize overhead well, but having few tasks means idle cores and poor load balance when task sizes vary.
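The overhead side of this trade-off can be made concrete with a rough cost model. The symbols below are assumptions introduced for illustration: total work $W$ split into $n$ equal tasks, $p$ cores, and a fixed per-task overhead $o$. Real tasks vary in size, so treat this as a sketch rather than a prediction.

```latex
% Simplified model: n equal tasks, p cores, fixed per-task overhead o.
\[
  T(n) \approx \left\lceil \frac{n}{p} \right\rceil \left( \frac{W}{n} + o \right),
  \qquad
  \text{overhead fraction} = \frac{o}{W/n + o}.
\]
% With n = p:            T = W/p + o        (one round, overhead paid once per core).
% With n = kp, k large:  T = W/p + k\,o     (overhead grows with the task count).
```

Under this model the overhead fraction approaches 1 as $n$ grows, which is the fine-grained failure. The model deliberately ignores variation in task size, which is what makes very coarse tasks costly in practice.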
Finding the sweet spot
The goal is tasks large enough that overhead is a small fraction of their work, yet numerous enough to keep every core busy. A common guideline is to create several times more tasks than cores so the scheduler has slack to balance load, while keeping each task well above the overhead threshold.
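The guideline above can be sketched as a small chunk-size helper. The function names, the default of 4 tasks per worker, and the `min_chunk` floor of 1000 items are all illustrative assumptions, not values taken from any particular runtime; the right floor depends on how expensive one item is relative to the per-task overhead.

```python
import math


def choose_chunk_size(n_items, n_workers, tasks_per_worker=4, min_chunk=1000):
    """Pick a chunk size that aims for several tasks per worker.

    tasks_per_worker and min_chunk are hypothetical tuning knobs:
    the first gives the scheduler slack to balance load, the second
    keeps each task above an assumed overhead threshold.
    """
    if n_items < 1 or n_workers < 1:
        raise ValueError("n_items and n_workers must be positive")
    target_tasks = n_workers * tasks_per_worker
    chunk = math.ceil(n_items / target_tasks)
    # Never go below the floor, even if that yields fewer tasks than targeted.
    return max(chunk, min_chunk)


def make_chunks(n_items, chunk_size):
    """Split the index range [0, n_items) into consecutive chunks."""
    return [
        range(start, min(start + chunk_size, n_items))
        for start in range(0, n_items, chunk_size)
    ]
```

For a large input the helper yields the targeted number of tasks; for a small input the floor takes over and fewer, larger tasks are produced, trading some load balance for lower overhead.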
Adaptive runtimes help: they fork only while workers are at risk of starving for tasks and switch to serial execution once enough parallelism exists, effectively tuning granularity at runtime.
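The adaptive policy described above can be illustrated with a single-threaded simulation. This is a sketch of the decision rule only, not how any specific runtime is implemented: `adaptive_sum`, the queue-length test, and the `min_size` cutoff are hypothetical names and choices made for the example.

```python
from collections import deque


def adaptive_sum(data, n_workers, min_size=1024):
    """Simulate an adaptive split policy on a sum over `data`.

    Returns (total, forks, serial_runs). No real threads are used;
    the deque stands in for a shared task queue.
    """
    queue = deque([(0, len(data))])
    total = 0
    forks = 0
    serial_runs = 0
    while queue:
        lo, hi = queue.popleft()
        # Fork only while the queue holds fewer tasks than workers
        # (so someone might starve) and the range is worth splitting.
        if len(queue) < n_workers and hi - lo > min_size:
            mid = (lo + hi) // 2
            queue.append((lo, mid))
            queue.append((mid, hi))
            forks += 1
        else:
            # Enough parallelism exists (or the range is small):
            # process this range serially.
            total += sum(data[lo:hi])
            serial_runs += 1
    return total, forks, serial_runs
```

Raising `n_workers` makes the policy split more eagerly, while raising `min_size` stops splitting earlier; together they play the role of the granularity knob that an adaptive runtime adjusts on the fly.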
Key idea
Granularity trades overhead against parallelism, so aim for tasks large enough to dwarf overhead yet plentiful enough to keep every core busy.