Edit Distance
The edit distance, also called Levenshtein distance, is the minimum number of single character edits needed to transform one string into another. The allowed operations are insert, delete, and replace.
Dynamic programming
Use a grid where rows represent prefixes of the first string and columns represent prefixes of the second:
- If the current characters are equal, the cost equals the diagonal cell with no extra operation.
- Otherwise take one plus the minimum of the three neighbors, representing replace from the diagonal, delete from above, and insert from the left.
The bottom right cell gives the total distance. The first row and column are seeded with increasing values because turning a string into an empty string costs one deletion per character.
Where it is used
Spell checkers suggest corrections by ranking words with small edit distance. The same idea powers fuzzy search and bioinformatics alignment scoring.
Key idea
Each cell stores the cheapest way to align two prefixes using insert delete and replace.