← Lessons

quiz vs the machine

Platinum1740

Algorithms

The Longest Common Prefix Array

Recording shared prefixes between neighbours in a sorted suffix list.

6 min read · advanced · beat Platinum to climb

Measuring overlap between sorted suffixes

The longest common prefix array, usually called the LCP array, is a companion to a suffix array. A suffix array lists all suffixes of a string in sorted order. The LCP array stores, for each adjacent pair of suffixes in that sorted order, the length of the longest prefix they share.

Why neighbours matter

Because the suffixes are sorted lexicographically, the most similar suffixes sit next to each other. The shared prefix length between neighbours therefore captures the maximum repetition structure of the string in a compact form. Two suffixes far apart in the array can share at most the minimum LCP value along the stretch between them.

What it unlocks

  • Counting the number of distinct substrings of a string.
  • Finding the longest repeated substring by taking the largest LCP value.
  • Answering longest common prefix queries between arbitrary suffixes via range minimum lookups.

There is also an elegant linear time construction, often called the Kasai method, that builds the LCP array from the suffix array by walking the original string in position order and reusing overlap from the previous step rather than recomputing it.

Key idea

The LCP array stores the shared prefix length between neighbouring sorted suffixes; this compact structure reveals repeats and distinct substring counts and supports overlap queries across the whole string.

Check yourself

Answer to earn rating on the learn ladder.

1. What does each LCP array entry store?

2. How can the longest repeated substring be found from the LCP array?

3. Why do adjacent suffixes give the most useful overlaps?