Two fingerprints are safer than one
A single rolling hash can collide: two different substrings sometimes produce the same fingerprint. When matching is run at scale, or when verifying every hit is too expensive, those collisions become a real concern. Double hashing addresses this by computing two independent hashes for each window using different bases and different moduli.
Why two helps so much
Two substrings are treated as equal only when both fingerprints agree. If each hash collides with some small probability, the chance that two unrelated strings collide on both at once is roughly the product of those probabilities. Multiplying two already small numbers gives a far smaller one, pushing accidental matches to negligible levels.
Practical notes
- Choose large prime moduli so the hash values spread evenly.
- Use distinct bases so the two hashes are not correlated.
- Combine the pair into one comparable key, often by treating it as a coordinate pair.
With double hashing, many implementations skip the direct verification entirely and trust the pair, accepting a vanishingly small error probability in exchange for speed.
Key idea
Double hashing combines two independent rolling hashes and accepts a match only when both agree; because the collision probabilities multiply, false matches become astronomically unlikely.