Commit 9950a4f

Merge pull request #1298 from tanyagupta01/edit-distance
Create 0072-edit-distance.md
2 parents de53702 + 78bb002 commit 9950a4f

1 file changed

+166
-0
lines changed

@@ -0,0 +1,166 @@
---
id: edit-distance
title: Edit Distance (LeetCode)
sidebar_label: 0072-Edit-Distance
tags:
  - String
  - Dynamic Programming
description: Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.
sidebar_position: 72
---

## Problem Statement

Given two strings `word1` and `word2`, return the minimum number of operations required to convert `word1` to `word2`.

The following three operations are permitted on a word:

- Insert a character
- Delete a character
- Replace a character

### Examples

**Example 1:**

```plaintext
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
```

**Example 2:**

```plaintext
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
```

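The three permitted operations correspond to ordinary string edits. As a minimal sketch, the following replays the edits listed in Example 1 and the final insertion from Example 2 using `std::string`. The function names `replayExampleOne` and `replayInsertStep` are hypothetical and used only for illustration.

```cpp
#include <string>

// Hypothetical helper: replays the three edits from Example 1 ("horse" -> "ros").
std::string replayExampleOne() {
    std::string word = "horse";
    word[0] = 'r';     // replace 'h' with 'r'       -> "rorse"
    word.erase(2, 1);  // delete the 'r' at index 2  -> "rose"
    word.erase(3, 1);  // delete the 'e' at index 3  -> "ros"
    return word;
}

// Hypothetical helper: replays the last edit from Example 2 ("exection" -> "execution").
std::string replayInsertStep() {
    std::string word = "exection";
    word.insert(4, 1, 'u');  // insert one 'u' before index 4 -> "execution"
    return word;
}
```

Each call above counts as one operation, so Example 1 uses three operations in total.
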
### Constraints

- `0 <= word1.length, word2.length <= 500`
- `word1` and `word2` consist of lowercase English letters.

## Solution

We explore two approaches: recursive dynamic programming with memoization, and iterative dynamic programming with tabulation.

### Approach 1: Recursive Dynamic Programming (Memoization)

Concept: Cache the answer for each pair of positions `(i, j)` so that no subproblem is solved more than once.

#### Algorithm

Let `f(i, j)` be the minimum number of operations needed to convert the prefix `S1[0..i]` into the prefix `S2[0..j]`. If `S1[i]` and `S2[j]` match, no operation is needed at this position and we move on to `f(i-1, j-1)`. Otherwise we have three options, each costing one operation: replace `S1[i]` with `S2[j]`, delete `S1[i]`, or insert `S2[j]` after `S1[i]`.

We cannot tell in advance which option leads to the minimum, so we try all three and keep the smallest result. Recursion expresses this naturally. If one string runs out (`i < 0` or `j < 0`), the only way to finish is to insert or delete every remaining character of the other string, which costs `j + 1` or `i + 1` operations respectively.

Steps to memoize the recursive solution:
- Create a `dp` array of size `[n][m]`, where `n` and `m` are the lengths of S1 and S2. The index `i` always lies between `0` and `n-1`, and `j` between `0` and `m-1`.
- Initialize every entry of `dp` to `-1`.
- Before computing `f(i, j)`, check whether the answer is already stored (i.e., `dp[i][j] != -1`). If so, return the stored value.
- Otherwise, compute the answer with the recursive relation as usual and store it in `dp[i][j]` before returning.

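Written out as a recurrence, with $f(i, j)$ denoting the edit distance between the prefixes of S1 and S2 ending at the 0-based indices $i$ and $j$ (the symbol $f$ is used here for illustration only), the algorithm above reads:

$$
f(i, j) =
\begin{cases}
j + 1 & \text{if } i < 0 \\
i + 1 & \text{if } j < 0 \\
f(i-1,\ j-1) & \text{if } S1[i] = S2[j] \\
1 + \min\bigl(f(i-1,\ j-1),\ f(i-1,\ j),\ f(i,\ j-1)\bigr) & \text{otherwise}
\end{cases}
$$

In the last case the three terms correspond to replacing, deleting, and inserting a character, in that order.
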
#### Implementation

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Returns the edit distance between S1[0..i] and S2[0..j].
int editDistanceUtil(string& S1, string& S2, int i, int j, vector<vector<int>>& dp) {
    // S1 is exhausted: insert the remaining j + 1 characters of S2
    if (i < 0)
        return j + 1;
    // S2 is exhausted: delete the remaining i + 1 characters of S1
    if (j < 0)
        return i + 1;

    // Reuse a previously computed answer
    if (dp[i][j] != -1)
        return dp[i][j];

    // Characters match: no operation needed at this position
    if (S1[i] == S2[j])
        return dp[i][j] = editDistanceUtil(S1, S2, i - 1, j - 1, dp);

    // Otherwise take the cheapest of replace, delete, and insert
    return dp[i][j] = 1 + min(editDistanceUtil(S1, S2, i - 1, j - 1, dp),
                              min(editDistanceUtil(S1, S2, i - 1, j, dp),
                                  editDistanceUtil(S1, S2, i, j - 1, dp)));
}

int editDistance(string& S1, string& S2) {
    int n = S1.size();
    int m = S2.size();

    // dp[i][j] == -1 means "not computed yet"
    vector<vector<int>> dp(n, vector<int>(m, -1));

    return editDistanceUtil(S1, S2, n - 1, m - 1, dp);
}
```

#### Complexity Analysis

- **Time complexity**: $O(N \times M)$
  Reason: There are $N \times M$ states, so at most $N \times M$ distinct subproblems are solved.
- **Space complexity**: $O(N \times M) + O(N + M)$
  Reason: The 2D `dp` array takes $O(N \times M)$ space and the recursion stack takes $O(N + M)$.

### Approach 2: Iterative Dynamic Programming (Tabulation)

Concept: The recursive solution uses the base cases `i < 0` and `j < 0`, but an array cannot be indexed at `-1`. To work around this, we shift every index one position to the right, so `dp[i][j]` holds the edit distance between the first `i` characters of S1 and the first `j` characters of S2.

#### Algorithm

1. Create a `dp` array of size `[n+1][m+1]`, initialized to zero.
2. Set the base cases using the shifted (1-based) indexing: `dp[i][0] = i` for the first column and `dp[0][j] = j` for the first row, since converting to or from an empty prefix takes one operation per character.
3. Fill in the rest of the table with the same recurrence as the recursive code, accounting for the index shift: `S1[i]` becomes `S1[i-1]`, and likewise for S2.
4. Return `dp[n][m]` as the answer.

#### Implementation

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int editDistance(string& S1, string& S2) {
    int n = S1.size();
    int m = S2.size();

    // Create a DP table to store edit distances
    vector<vector<int>> dp(n + 1, vector<int>(m + 1, 0));

    // Initialize the first column (delete i characters) and first row (insert j characters)
    for (int i = 0; i <= n; i++) {
        dp[i][0] = i;
    }
    for (int j = 0; j <= m; j++) {
        dp[0][j] = j;
    }

    // Fill in the DP table
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            if (S1[i - 1] == S2[j - 1]) {
                // If the characters match, no additional cost
                dp[i][j] = dp[i - 1][j - 1];
            } else {
                // Minimum of three choices:
                // 1. Replace the character S1[i-1] with S2[j-1]
                // 2. Delete the character S1[i-1]
                // 3. Insert the character S2[j-1] into S1
                dp[i][j] = 1 + min(dp[i - 1][j - 1], min(dp[i - 1][j], dp[i][j - 1]));
            }
        }
    }

    // The value at dp[n][m] contains the edit distance
    return dp[n][m];
}
```

#### Complexity Analysis

- **Time complexity**: $O(N \times M)$
  Reason: Two nested loops run over all $N \times M$ pairs of positions.
- **Space complexity**: $O(N \times M)$
  Reason: We use a table of size $(N+1) \times (M+1)$. The recursion stack space is eliminated.
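
The tabulation only ever reads the previous row and the current row of the table, which suggests that two rows may be enough. The following is a sketch of that variation under the same recurrence. It is not part of the solution above, and the name `editDistanceTwoRows` is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant: same recurrence as the tabulation, keeping two rows only.
int editDistanceTwoRows(const std::string& s1, const std::string& s2) {
    const int n = static_cast<int>(s1.size());
    const int m = static_cast<int>(s2.size());

    // prev holds row i-1 of the table, curr holds row i
    std::vector<int> prev(m + 1), curr(m + 1);
    for (int j = 0; j <= m; j++) {
        prev[j] = j;  // row 0: insert j characters
    }

    for (int i = 1; i <= n; i++) {
        curr[0] = i;  // column 0: delete i characters
        for (int j = 1; j <= m; j++) {
            if (s1[i - 1] == s2[j - 1]) {
                // Characters match: carry over the diagonal value
                curr[j] = prev[j - 1];
            } else {
                // Cheapest of replace (diagonal), delete (up), insert (left)
                curr[j] = 1 + std::min({prev[j - 1], prev[j], curr[j - 1]});
            }
        }
        std::swap(prev, curr);  // the row just filled becomes the previous row
    }

    // After the final swap, prev holds row n
    return prev[m];
}
```

Under these assumptions the extra space drops to $O(M)$ while the time complexity stays $O(N \times M)$. For instance, `editDistanceTwoRows("horse", "ros")` evaluates to `3`, matching Example 1.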
