← Lessons

quiz vs the machine

Platinum1780

Databases

The Gaps and Islands Problem

Finding consecutive runs and the breaks between them with a difference of ranks trick.

6 min read · advanced · beat Platinum to climb

Runs and the breaks between them

The gaps and islands problem asks you to identify consecutive sequences, called islands, and the missing stretches between them, called gaps. Examples include consecutive login days, unbroken ranges of ids, or periods of continuous activity.

The difference of two sequences trick

The classic solution compares two running sequences:

  • Generate ROW NUMBER ordered by the value.
  • Subtract that row number from the value itself, or from a grouping key.
  • Rows in the same island share the same difference, because both sequences advance in lockstep.

When the values are consecutive, the gap between the value and its row number stays constant. A break in consecutiveness shifts that difference, starting a new island.

Grouping by the constant

Once each row carries this constant difference, you GROUP BY it. Each group is one island. MIN and MAX of the value within a group give the island's start and end. To find gaps, use LEAD on the island ends to compare each island's end with the next island's start.

Why it works

Both the actual value and the row number increase by one across consecutive rows, so their difference is invariant within a run and only changes at a discontinuity.

Key idea

Gaps and islands are solved by subtracting a row number from the value; consecutive rows share a constant difference that uniquely labels each island, and grouping by it reveals the runs.

Check yourself

Answer to earn rating on the learn ladder.

1. What makes rows in the same island share a constant value?

2. After computing the difference, how do you isolate each island?

3. How can you locate the gaps between islands?