# Edit Distance

Jump to navigation
Jump to search

**Edit Distance** is a standard Dynamic Programming problem. Given two strings and , the edit distance between and is the minimum number of operations required to convert string to . The following operations are typically used:

- Replacing one character of string by another character.
- Deleting a character from string
- Adding a character to string

## Brute-Force Approach[edit]

- If characters to be compared are equal i.e. s1[m] = s2[n], then we will compare (m+1)th character of s1 to (n+1)th character of s2 (if both exist).
- If we replace one character of string s1 then we will compare (m+1)th character of s1 to (n+1)th character of s2 (if both exist).
- If we insert one character to string s1 then we will compare mth character of s1 to (n+1)th character of s2 (if both exist).
- If we delete one character from string s1 then we will compare (m+1)th character of s1 to nth character of s2 (if both exist).

Pseudocode for edit distance can be written using recursion easily:

```
int count(string s1, string s2, int m, int n)
{
if (n == s2.length()) return s1.length()-m;
if (m == s1.length()) return s2.length()-n;
if (s1[m] == s2[n]) return count(s1,s2,m+1,n+1);
if (s1[m] != s2[n]) {
return 1+min(min(count(s1,s2,m,n+1), count(s1,s2,m+1,n)), count(s1,s2,m+1,n+1));
}
}
```

## Dynamic Programming Approach[edit]

If we use Dynamic Programming, then we can find the Edit Distance in , here |s1| denote length of string s1 and |s2| denote length of string s2.

Code for Edit Distance using dynamic programming is as follows:

```
int count(string s1, string s2)
{
int m = s1.length();
int n = s2.length();
for (int i = 0; i <= m; i++) {
v[i][0] = i;
}
for (int j = 0; j <= n; j++) {
v[0][j] = j;
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (s1[i-1] == s2[j-1]) v[i][j] = v[i-1][j-1];
else v[i][j] = 1 + min(min(v[i][j-1],v[i-1][j]),v[i-1][j-1]);
}
}
return v[m][n];
}
```

Note that the space requirement can be improved to , as in each iteration only requires element from the current and previous row.

## Variations[edit]

Variations of **Edit Distance** problem includes one where the cost of each operation is not the same.