Parallel Map Reduce
Parallel map reduce splits a collection into chunks, applies a map to each element in parallel, then folds the results with a reduce. It underlies parallel streams, big data engines, and many GPU kernels.
The map step is embarrassingly parallel because each element transforms independently. The reduce step is where care is needed. For safe parallel reduction the combine operation must be associative, meaning grouping does not change the result. Sum and max are associative; subtraction is not. Associativity lets the runtime combine partial results in any tree shape across threads.
A second property, having an identity element such as zero for sum, lets empty chunks fold cleanly and lets the framework start each chunk from a neutral value.
- Map Transform each element with a pure function.
- Combine Merge two partial results associatively.
- Reduce Fold all partials into one answer.
Watch out for hidden state. A reduction that mutates a shared accumulator instead of returning a new value introduces races. Frameworks favor side effect free functions for exactly this reason. When the combine is cheap and elements are many, parallel map reduce scales nearly linearly with cores.
Key idea
Map elements in parallel, then reduce with an associative combine and identity so partial results merge safely in any order.