← Lessons

quiz vs the machine

Gold1450

Concurrency

Parallel Prefix Sum

Compute every running total at once with an up sweep and a down sweep.

6 min read · core · beat Gold to climb

Parallel Prefix Sum

A prefix sum, also called a scan, replaces each element with the sum of all elements up to and including it. Serially this is one pass, but each output depends on the one before it, which looks stubbornly sequential.

Breaking the dependency

The classic work efficient scan uses two passes over a balanced tree:

  • The up sweep builds partial sums upward, just like a reduction, leaving subtree totals at internal nodes.
  • The down sweep pushes a running offset back down, combining it with the stored partials to produce each prefix.

This performs work proportional to the array length while reaching a depth of only log of the length, so it parallelizes well.

Inclusive versus exclusive

An inclusive scan includes the current element in its output, while an exclusive scan excludes it and starts from an identity value. Many algorithms such as stream compaction and radix sort use exclusive scans to compute output positions.

Key idea

A two phase up sweep and down sweep turns the serial looking scan into a logarithmic depth parallel algorithm with linear work.

Check yourself

Answer to earn rating on the learn ladder.

1. What does the up sweep of a work efficient scan compute?

2. How does an exclusive scan differ from an inclusive scan?

3. What is the depth of a tree based prefix sum on n items?