What it recognizes
A suffix automaton is a minimal deterministic machine whose paths from the start state spell exactly the substrings of a given string. Following any substring lands you in some state, and the machine is built so it stays remarkably small.
States group end positions
Each state represents a set of substrings that share the same set of ending positions in the original string. These groups are called end position classes. Substrings in one class behave identically for matching, so they merge into a single state.
Suffix links
States are connected by suffix links that point from a state to the state of its longest suffix lying in a different class. These links form a tree and are the backbone of incremental construction.
- Construction adds one character at a time.
- Each step creates at most a couple of new states.
Why it is small
A striking fact is that the number of states and transitions stays linear in the string length. This compactness lets it count distinct substrings, find longest common substrings, and locate patterns efficiently.
Key idea
A suffix automaton is a minimal machine whose states group substrings by ending positions and are linked by suffix links, recognizing every substring in linear size.