← Lessons

quiz vs the machine

Platinum1820

Algorithms

The Suffix Array

Sort all suffixes of a string to power fast substring searches.

6 min read · advanced · beat Platinum to climb

What it is

A suffix array is the list of all starting positions of a string's suffixes, sorted by the suffixes in dictionary order. It captures the same information as a suffix tree but in a compact array of integers.

Why sorted suffixes help

Because the suffixes are sorted, every occurrence of a pattern appears as a contiguous block in the array. To find a pattern you binary search for the block whose suffixes start with it. This turns substring search into a clean range lookup.

Building it efficiently

A naive build sorts the suffixes directly, which is slow because comparing two suffixes can scan many characters. The standard speedup sorts by doubling:

  • First rank suffixes by their first character.
  • Then by their first two characters, reusing the previous ranks.
  • Then by four, then eight, each round combining two ranks already computed.

After a few doubling rounds the full order is fixed. A companion array of longest common prefixes between neighbors then unlocks many string queries, such as counting distinct substrings.

Key idea

Sort suffixes by doubling the compared length so any pattern's matches form one contiguous, binary searchable block.

Check yourself

Answer to earn rating on the learn ladder.

1. Where do all occurrences of a pattern appear in a suffix array?

2. How does the doubling construction speed up sorting?