Sorting without data dependent branches
Bitonic sort is a sorting network: its sequence of compare and swap operations is fixed in advance and does not depend on the data values. That predictability is exactly what parallel hardware and GPUs want, because every lane runs the same instructions.
Bitonic sequences
A bitonic sequence first increases then decreases, or is a rotation of such a pattern. The algorithm builds bitonic sequences and then merges them:
- Recursively sort the first half ascending and second half descending to form a bitonic sequence.
- Bitonic merge: compare elements a fixed distance apart and swap them into order, halving the distance each step.
- Repeat until the whole sequence is sorted.
Cost and parallelism
The network has a logarithmic number of stages, and each stage itself has a logarithmic number of steps, giving depth proportional to the square of the log of the input. Total work is larger than an optimal comparison sort, but every compare in a step is independent, so all of them run at once.
Key idea
Bitonic sort trades extra total comparisons for a fixed data independent structure, letting a sorting network run thousands of independent compare and swaps per step on parallel hardware.