The Fork Join Framework

The fork join framework parallelizes divide and conquer problems. A task forks into smaller subtasks, those run in parallel, and the parent joins by waiting for them and combining their results. It suits recursive shapes like sorting halves of an array or summing a tree.

The engine is a pool of worker threads, each with its own deque of tasks. A worker pushes new subtasks onto its own deque and pops from there, giving good cache locality. When a worker runs dry it performs work stealing: it takes a task from the tail of another worker's deque. This keeps all cores busy even when the split is uneven.

Effective use depends on a threshold. Splitting all the way to single elements floods the pool with tiny tasks whose overhead dwarfs the work. Below a sequential cutoff a task should just compute directly. Above it, forking pays off.

Fork Create subtasks and schedule them.
Compute Solve the base case directly when small enough.
Join Wait for children and merge.

Fork join shines on CPU bound, easily divisible work. It is a poor fit for tasks that block on IO, since a parked worker cannot steal or be stolen from efficiently.

Key idea

Fork join recursively splits CPU work and uses work stealing to keep cores busy, with a threshold to avoid tiny task overhead.

The Fork Join Framework

The Fork Join Framework

Key idea

Check yourself