Tracking disjoint sets
A union find, also called a disjoint set, manages a collection of elements partitioned into non overlapping groups. It answers two questions efficiently: which group does an element belong to, and merge two groups into one. Each group is represented as a tree, and every element points to a parent, with the group's root acting as its identifier.
The two operations
- Find follows parent pointers up to the root, returning the group's identifier. Two elements are in the same group exactly when their roots match.
- Union finds the two roots and links one under the other, merging the groups.
Two optimizations that matter
Naively, the trees can grow tall and slow down find. Two refinements keep them flat:
- Union by rank attaches the shorter tree under the taller one, so the merged tree stays shallow.
- Path compression makes every node visited during a find point straight at the root afterward, flattening the path for future queries.
Together these make find and union run in nearly constant time per operation, fast enough that union find powers connectivity checks and cycle detection in graph algorithms.
Key idea
Union find represents groups as parent pointer trees, where union by rank and path compression keep them shallow so finds and merges run in nearly constant time.