One Query, Many Workers
A large scan or aggregation can use multiple CPU cores at once. Parallel query execution divides the work among a leader and several worker processes that run pieces of the plan concurrently.
Partitioning the Work
- A parallel scan hands out page ranges to workers so each reads a disjoint slice.
- Workers compute partial results, for example partial aggregates or hashed partitions.
- A gather node in the leader collects worker outputs and combines them.
Some operators are naturally parallel, like a per worker partial sum, while a final combine step merges the partials. The optimizer estimates whether the speedup justifies the startup and coordination cost.
Limits
Parallelism helps large scans but adds overhead, so small queries stay serial. Skewed data can leave one worker doing most of the work, and shared resources like the buffer pool can become a bottleneck, capping the realized speedup.
Key idea
Parallel execution splits a big scan across worker processes that compute partial results, which a gather node combines, but startup cost and data skew limit the gains so small queries stay serial.