Algorithms

String Hashing Collisions

Why two different strings can share a hash, and how to make that almost never matter.

5 min read · advanced

Polynomial hashing

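A common string hash treats each character as a digit and evaluates the string as a polynomial in a chosen base, reduced modulo a large number: for a string s of length n, base b, and modulus m, the hash is (s[0]·b^(n-1) + s[1]·b^(n-2) + ... + s[n-1]) mod m. Each string then gets a single number that can be compared in constant time and updated incrementally (rolled) as a fixed-length window slides over a text.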
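As a minimal sketch of the idea above, the following Python computes a polynomial hash with Horner's rule and rolls it across a fixed-length window. The names `poly_hash` and `roll`, the base 131, and the modulus 2^61 - 1 are illustrative assumptions, not values prescribed by this lesson.

```python
MOD = (1 << 61) - 1   # assumed modulus: the Mersenne prime 2^61 - 1
BASE = 131            # assumed fixed base, for illustration only


def poly_hash(s: str, base: int = BASE, mod: int = MOD) -> int:
    """Evaluate s as a polynomial in `base`, reduced modulo `mod` (Horner's rule)."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


def roll(h: int, old: str, new: str, high: int,
         base: int = BASE, mod: int = MOD) -> int:
    """Slide a fixed-length window one step: drop `old` on the left, append `new`.

    `high` must equal base**(window_length - 1) % mod.
    """
    return ((h - ord(old) * high) * base + ord(new)) % mod


# Rolling "abc" -> "bcd" gives the same value as hashing "bcd" from scratch.
high = pow(BASE, 2, MOD)          # window length 3, so exponent 2
rolled = roll(poly_hash("abc"), "a", "d", high)
```

Rolling costs a constant number of arithmetic operations per step, which is what makes comparing every window of a long text affordable.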

Collisions are unavoidable

Because there are infinitely many possible strings but only finitely many hash values, the pigeonhole principle guarantees that some different strings share a hash. This is a collision. Comparing hashes alone therefore risks declaring two different strings equal, and an adversary who knows the base and modulus can construct collisions on purpose.
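To make the pigeonhole argument concrete, here is a small sketch that searches for a collision by brute force. The modulus 101 is deliberately tiny so a collision shows up quickly; the function name `find_collision`, the two-letter alphabet, and the parameters are assumptions chosen for illustration.

```python
from itertools import product


def poly_hash(s: str, base: int, mod: int) -> int:
    """Polynomial hash of s in `base`, reduced modulo `mod`."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


def find_collision(base: int, mod: int, alphabet: str = "ab", max_len: int = 12):
    """Return two different strings with the same hash, or None if none is found.

    Each candidate string is generated exactly once, so a repeated hash value
    always belongs to two distinct strings.
    """
    seen = {}
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            s = "".join(chars)
            h = poly_hash(s, base, mod)
            if h in seen:
                return seen[h], s
            seen[h] = s
    return None


# Only 101 hash values exist, but far more than 101 strings are tried,
# so a collision must be found.
pair = find_collision(base=31, mod=101)
```

With a realistic modulus the same search would take far longer, but the argument does not change: the value space is finite, so collisions exist.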

Reducing the risk

Several practices push collision probability down:

  • Use a large prime modulus so the value space is wide.
  • Pick the base randomly at run time so an attacker cannot precompute collisions.
  • Use double hashing: compute two independent hashes (different moduli, bases, or both) and compare them as a pair. An accidental match must then happen in both hashes at once, which is vastly less likely.
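The three practices above can be combined in a few lines. This is a sketch under stated assumptions: the primes 1,000,000,007 and 998,244,353 are common choices rather than requirements, the lower bound of 256 on the bases assumes mostly byte-sized character codes, and `double_hash` is a hypothetical helper name.

```python
import secrets

# Two large primes, assumed here as the moduli of the two independent hashes.
MOD1 = 1_000_000_007
MOD2 = 998_244_353

# Bases drawn once per run from a cryptographic source, so they cannot be
# known in advance by someone preparing colliding inputs.
BASE1 = 256 + secrets.randbelow(MOD1 - 257)   # in [256, MOD1 - 2]
BASE2 = 256 + secrets.randbelow(MOD2 - 257)   # in [256, MOD2 - 2]


def double_hash(s: str) -> tuple[int, int]:
    """Return a pair of polynomial hashes computed with independent parameters."""
    h1 = h2 = 0
    for ch in s:
        c = ord(ch)
        h1 = (h1 * BASE1 + c) % MOD1
        h2 = (h2 * BASE2 + c) % MOD2
    return h1, h2


same = double_hash("collision") == double_hash("collision")
different = double_hash("abc") != double_hash("abd")
```

Hash values change from run to run because the bases do, so they should not be stored and compared across runs unless the bases are stored with them.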

Verify when correctness is critical

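In exact algorithms, treat a hash match only as a candidate and confirm it with a character-by-character comparison. In probabilistic algorithms you skip that check and accept a tiny, quantifiable failure chance in exchange for speed: with a randomly chosen base and a prime modulus p, two distinct strings of length at most n collide with probability at most about n/p.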
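The exact approach can be sketched as a Rabin-Karp style substring search that confirms every hash match before reporting it. The function name `find_all` and the default base and modulus are assumptions; the base is fixed only to keep the example reproducible, and because every candidate is verified, the result stays correct even with weak parameters.

```python
def find_all(text: str, pattern: str,
             base: int = 257, mod: int = (1 << 61) - 1) -> list[int]:
    """Return every start index where `pattern` occurs in `text`.

    A rolling hash filters candidate positions; each candidate is then
    verified with a real string comparison, so collisions cannot produce
    a false match.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(base, m - 1, mod)      # weight of the window's leftmost character
    target = window = 0
    for i in range(m):
        target = (target * base + ord(pattern[i])) % mod
        window = (window * base + ord(text[i])) % mod

    matches = []
    for start in range(n - m + 1):
        # Hash equality is only a hint; the slice comparison is the proof.
        if window == target and text[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            window = ((window - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return matches


positions = find_all("abracadabra", "abra")
# A tiny modulus forces many collisions, yet verification keeps the answer exact.
positions_weak = find_all("abracadabra", "abra", base=3, mod=5)
```

Dropping the slice comparison turns this into the probabilistic variant: faster in the worst case, but a collision would then be reported as a match.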

Key idea

Polynomial string hashing maps infinitely many strings onto a finite set of values, so collisions must exist, but a large prime modulus, a randomly chosen base, double hashing, and verification make false matches rare or harmless.

Check yourself

1. Why are hash collisions inevitable for string hashing?

2. Which practice most reduces an adversary's ability to force collisions?

3. How should exact algorithms treat a hash match?