← Lessons

quiz vs the machine

Gold1490

Algorithms

Suffix Structures Overview

How suffix tries, trees, arrays, and automata trade space for query power.

6 min read · core · beat Gold to climb

A family that indexes every suffix

Many powerful string algorithms share one idea: precompute a structure over all suffixes of a text so that later pattern queries are fast. Once every suffix is indexed, asking whether a pattern occurs becomes a walk down that structure.

The main members

  • Suffix trie: a tree where each root to node path spells a substring. Conceptually simple but can use space proportional to the square of the text length.
  • Suffix tree: the trie with non branching chains compressed into single edges, shrinking it to linear space while keeping the same queries.
  • Suffix array: the sorted list of suffix starting positions, far more memory friendly and pairing naturally with the LCP array.
  • Suffix automaton: the smallest automaton accepting every substring, compact and elegant for counting and matching.

How they compare

Tries are the easiest to picture but the most wasteful. Trees are fast and expressive but heavier in memory and trickier to build. Arrays sacrifice some direct expressiveness for compactness and cache friendliness, recovering power through the LCP array. Choosing among them is a trade between memory, build complexity, and the kinds of queries you need.

Key idea

Suffix structures all index every suffix so pattern queries become a walk; tries are simplest but wasteful, while trees, arrays, and automata trade build complexity and memory for compact, powerful queries.

Check yourself

Answer to earn rating on the learn ladder.

1. What is the suffix tree relative to the suffix trie?

2. What is a suffix array most valued for?