Hashing a window
A rolling hash treats a substring as a number in some base, like reading characters as digits. The key feature is that when the window slides by one character, the new hash is computed from the old one with a few arithmetic steps, instead of rereading the whole window.
The roll step
To move the window forward by one:
- Remove the contribution of the character leaving on the left.
- Shift the remaining value up by one base position.
- Add the contribution of the new character entering on the right.
All arithmetic is done under a modulus to keep numbers bounded. This makes each slide cheap, so scanning a text for a pattern compares hashes at every position in linear time.
Collisions and care
Two different substrings can share a hash, a collision. So a hash match should be confirmed by a direct character comparison, or guarded by using two independent hashes to make accidental matches extremely unlikely. Rolling hashes power substring search, detecting repeated blocks, and comparing many ranges of a string quickly.
Key idea
Update the window hash by removing the old character and adding the new one under a modulus, then verify any hash match directly.