Polynomial hashing
A common string hash treats each character as a digit and evaluates the string as a polynomial in a chosen base, reduced modulo a large number. Each string gets a single integer that can be compared cheaply and updated in constant time as a fixed-length window slides over the text (a rolling hash).
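As a minimal sketch of the idea, the snippet below evaluates a string with Horner's rule and then rolls the hash one position to the right. The names poly_hash and roll, the base 257, and the modulus 1_000_000_007 are assumptions chosen for illustration, not values prescribed by the text.

```python
MOD = 1_000_000_007   # a large prime (illustrative choice)
BASE = 257            # fixed here only to keep the example reproducible


def poly_hash(s: str, base: int = BASE, mod: int = MOD) -> int:
    """Evaluate s as a polynomial in `base`, reduced modulo `mod`."""
    h = 0
    for ch in s:
        # Horner's rule: shift the running value by one digit, add the next one.
        h = (h * base + ord(ch)) % mod
    return h


def roll(h: int, out_ch: str, in_ch: str, high_power: int,
         base: int = BASE, mod: int = MOD) -> int:
    """Slide a fixed-length window one step: drop out_ch, append in_ch.

    high_power must equal base ** (window_length - 1) % mod.
    """
    h = (h - ord(out_ch) * high_power) % mod   # remove the leading digit
    return (h * base + ord(in_ch)) % mod       # shift and add the new digit


# Rolling the hash of "abc" forward gives the hash of "bcd".
high = pow(BASE, 2, MOD)                       # window length is 3
assert roll(poly_hash("abc"), "a", "d", high) == poly_hash("bcd")
```

The rolling step costs a constant number of arithmetic operations regardless of the window length, which is what makes the hash attractive for substring search.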
Collisions are unavoidable
Because far more strings exist than hash values, some pairs of different strings must produce the same hash. This is a collision. Comparing only hashes therefore risks a false equality, and an adversary who knows the base and modulus in advance can even craft collisions on purpose.
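To make the pigeonhole argument concrete, the sketch below uses a deliberately tiny modulus so that a collision is guaranteed among two-letter strings. The helper names and the parameters (base 31, modulus 101) are hypothetical and chosen only so the search finishes instantly; real code would use a far larger modulus.

```python
from itertools import product
import string


def poly_hash(s: str, base: int, mod: int) -> int:
    """Polynomial hash of s in the given base, reduced modulo mod."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


def find_collision(base: int = 31, mod: int = 101) -> tuple[str, str]:
    """Return two different two-letter strings with the same hash.

    There are 26 * 26 = 676 candidate strings but only `mod` possible
    hash values, so for mod < 676 a collision must exist.
    """
    seen: dict[int, str] = {}
    for chars in product(string.ascii_lowercase, repeat=2):
        s = "".join(chars)
        h = poly_hash(s, base, mod)
        if h in seen:
            return seen[h], s
        seen[h] = s
    raise ValueError("no collision found; modulus is too large for this search")


a, b = find_collision()
assert a != b and poly_hash(a, 31, 101) == poly_hash(b, 31, 101)
```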
Reducing the risk
Several practices push collision probability down:
- Use a large prime modulus so the value space is wide.
- Pick the base randomly at run time so an attacker cannot precompute collisions.
- Use double hashing: compute two independent hashes (for example, with different moduli and bases) and compare them as a pair, which makes an accidental match vastly less likely.
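The three practices above can be combined in a few lines. This is a sketch under stated assumptions: the two moduli are well-known primes picked for illustration, the name double_hash is hypothetical, and the bases are drawn with the standard-library secrets module so they cannot be predicted before the program starts.

```python
import secrets

# Two large primes (illustrative choices, not mandated by the text).
MOD1 = 1_000_000_007
MOD2 = 998_244_353

# Bases are drawn once per process, at run time, from [256, MOD - 1].
BASE1 = 256 + secrets.randbelow(MOD1 - 256)
BASE2 = 256 + secrets.randbelow(MOD2 - 256)


def double_hash(s: str) -> tuple[int, int]:
    """Return a pair of polynomial hashes computed with independent parameters."""
    h1 = h2 = 0
    for ch in s:
        # +1 keeps every digit nonzero, so leading "\0" characters still count.
        digit = ord(ch) + 1
        h1 = (h1 * BASE1 + digit) % MOD1
        h2 = (h2 * BASE2 + digit) % MOD2
    return h1, h2


# Two strings are treated as probably equal only if both components match.
assert double_hash("banana") == double_hash("banana")
assert double_hash("abc") != double_hash("abd")
```

Because the bases change on every run, hash values from one process must not be stored and compared against values from another.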
Verify when correctness is critical
In exact algorithms, treat a hash match as a candidate and confirm with a real character comparison. In probabilistic algorithms you accept a tiny, quantifiable failure chance in exchange for speed.
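The exact variant can be sketched as a Rabin-Karp style substring search in which a hash match only nominates a position and a direct comparison confirms it. The function name find_all and its default parameters are assumptions for illustration; the base is fixed here for reproducibility, whereas production code would randomize it.

```python
def find_all(text: str, pattern: str,
             base: int = 257, mod: int = 1_000_000_007) -> list[int]:
    """Return every index where pattern occurs in text.

    A hash match is treated as a candidate only; the slice comparison
    removes any false positive caused by a collision.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(base, m - 1, mod)          # weight of the window's first character
    target = window = 0
    for i in range(m):
        target = (target * base + ord(pattern[i])) % mod
        window = (window * base + ord(text[i])) % mod

    hits = []
    for start in range(n - m + 1):
        # Cheap hash test first, real character comparison second.
        if window == target and text[start:start + m] == pattern:
            hits.append(start)
        if start + m < n:
            # Roll the window: drop text[start], append text[start + m].
            window = (window - ord(text[start]) * high) % mod
            window = (window * base + ord(text[start + m])) % mod
    return hits


# Even with an absurdly small modulus, verification keeps the answer exact.
assert find_all("abracadabra", "abra", base=31, mod=7) == find_all("abracadabra", "abra")
```

Dropping the slice comparison turns this into the probabilistic variant: faster in the worst case, but a collision would then be reported as a match.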
Key idea
Polynomial string hashing maps many strings into a finite set of values, so collisions exist, but a large prime modulus, a randomly chosen base, double hashing, and verification make false matches rare or harmless.