← Lessons

quiz vs the machine

Gold1420

Algorithms

Double Hashing to Avoid Collisions

Pairing two independent hashes to make accidental string matches astronomically rare.

5 min read · core · beat Gold to climb

Two fingerprints are safer than one

A single rolling hash can collide: two different substrings sometimes produce the same fingerprint. When matching is run at scale, or when verifying every hit is too expensive, those collisions become a real concern. Double hashing addresses this by computing two independent hashes for each window using different bases and different moduli.

Why two helps so much

Two substrings are treated as equal only when both fingerprints agree. If each hash collides with some small probability, the chance that two unrelated strings collide on both at once is roughly the product of those probabilities. Multiplying two already small numbers gives a far smaller one, pushing accidental matches to negligible levels.

Practical notes

  • Choose large prime moduli so the hash values spread evenly.
  • Use distinct bases so the two hashes are not correlated.
  • Combine the pair into one comparable key, often by treating it as a coordinate pair.

With double hashing, many implementations skip the direct verification entirely and trust the pair, accepting a vanishingly small error probability in exchange for speed.

Key idea

Double hashing combines two independent rolling hashes and accepts a match only when both agree; because the collision probabilities multiply, false matches become astronomically unlikely.

Check yourself

Answer to earn rating on the learn ladder.

1. When does double hashing consider two windows equal?

2. Why is the combined collision probability so small?

3. What property should the two moduli have?