Commit 55fbe3e

committed
adding finishing touches to gaussian_elimination.md
1 parent 73c9652 commit 55fbe3e

File tree

3 files changed

+103
-23
lines changed

SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
* [Computational Mathematics](chapters/computational_mathematics/computational_mathematics.md)
1414
* [Matrix Methods](chapters/computational_mathematics/matrix_methods/matrix_methods.md)
1515
* [Gaussian Elimination](chapters/computational_mathematics/matrix_methods/gaussian_elimination.md)
16+
* [Thomas Algorithm](chapters/computational_mathematics/matrix_methods/thomas.md)
1617
* [FFT](chapters/computational_mathematics/cooley_tukey.md)
1718
* [Computational Physics](chapters/computational_physics/computational_physics.md)
1819
* [Computational Biology](chapters/computational_biology/computational_biology.md)

chapters/computational_mathematics/matrix_methods/gaussian_elimination.md

Lines changed: 101 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ $$
5555
$$
5656

5757

58-
Now. At first, this doesn't seem to help anything, so let's think of this in another way. Wouldn't it be great if the system of equations looked like this:
58+
Now, at first, this doesn't seem to help anything, so let's think of this in another way. Wouldn't it be great if the system of equations looked like this:
5959

6060

6161
$$
@@ -67,7 +67,7 @@ y + 2z &= 2 \\
6767
$$
6868

6969

70-
Then we could just solve for $$z$$ and plug it in to the top two equations to solve for $$x$$ and $$y$$! In matrix form, it would look like this
70+
Then we could just solve for $$z$$ and plug that value in to the top two equations to solve for $$x$$ and $$y$$! In matrix form, it would look like this
7171

7272

7373
$$
@@ -84,9 +84,9 @@ $$
8484
and it has a particular name: _Row Echelon Form_. Basically, any matrix can be considered in row echelon form if
8585

8686
1. All non-zero rows are above rows of all zeros
87-
2. The leading coefficient or _pivot_ -- the first non-zero element in every row when reading from left to right is right of the pivot of the row above it.
87+
2. The leading coefficient or _pivot_ (the first non-zero element in every row when reading from left to right) is right of the pivot of the row above it.
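
The two conditions above can be checked mechanically. The following is a minimal sketch, assuming NumPy is available; the function name `is_row_echelon` and the tolerance `tol` are hypothetical choices for illustration and do not come from the chapter.

```python
import numpy as np

def is_row_echelon(matrix, tol=1e-12):
    """Check the two row echelon conditions (illustrative helper, not from the chapter)."""
    a = np.asarray(matrix, dtype=float)
    previous_pivot = -1      # column index of the pivot in the row above
    seen_zero_row = False
    for row in a:
        nonzero = np.flatnonzero(np.abs(row) > tol)
        if nonzero.size == 0:
            # An all-zero row: every row after it must also be all zeros
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False     # condition 1 violated: non-zero row below a zero row
        if nonzero[0] <= previous_pivot:
            return False     # condition 2 violated: pivot not strictly to the right
        previous_pivot = nonzero[0]
    return True
```

An upper triangular augmented matrix passes this check, while a matrix whose second row starts in the same column as the first does not.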
8888

89-
Now, Row Eschelon form is nice, but wouldn't it be even better if our system of equations looked simply like this
89+
Now, Row Echelon Form is nice, but wouldn't it be even better if our system of equations looked simply like this
9090

9191

9292
$$
@@ -112,21 +112,21 @@ $$
112112
$$
113113

114114

115-
And again has a special name **\*Reduced** Row Eschelon Form\*. Now, it seems obvious to point out that if we remove the values after the equals sign \($$=$$\), Row Eschelon Form is an upper triangular matrix, Reduced Row Eschelon Form is diagonal. This might not be important now, but it will play an important role in future discussions, so keep it bussing in the back of your brain.
115+
And again, it has a special name: _**Reduced** Row Echelon Form_. Now, it seems obvious to point out that if we remove the values to the right of the equals sign \($$=$$\), Row Echelon Form is an upper triangular matrix, while Reduced Row Echelon Form is diagonal. This might not be important now, but it will play an important role in future discussions, so keep it buzzing in the back of your brain.
116116

117-
For now, I hope the motivation is clear: we want to convert a matrix into Row Eschelon and Reduced Row Eschelon form to make large systems of equations trivial to solve, so we need some method to do that. What is that method called? \(Hint: It's the title of this section\)
117+
For now, I hope the motivation is clear: we want to convert a matrix into Row Echelon and (potentially) Reduced Row Echelon Form to make large systems of equations trivial to solve, so we need some method to do that. What is that method called? \(Hint: It's the title of this section\)
118118

119119
That's right! _Gaussian Elimination_
120120

121121
## The Method
122122

123-
Here I should point out that Gaussian elimination makes sense from a purely analytical point of view. That is to say that for small systems of equations, it's relatively straightforward to do this method by hand; however, for large systems, this \(of course\) become tedious and we will find an appropriate numerical solution. For this reason, I have split this section into two parts. One covers the analytical framework, and the other an algorithm you can write in your favorite programming language.
123+
Here I should point out that Gaussian elimination makes sense from a purely analytical point of view. That is to say that for small systems of equations, it's relatively straightforward to do this method by hand; however, for large systems, this \(of course\) becomes tedious and we will need to find an appropriate numerical solution. For this reason, I have split this section into two parts. One will cover the analytical framework, and the other will cover an algorithm you can write in your favorite programming language.
124124

125125
In the end, reducing large systems of equations boils down to a game you play on a seemingly random matrix where you have the following moves available:
126126

127127
1. You can swap any two rows
128128
2. You can multiply any row by a non-zero scale value
129-
3. You can add any row to a multiple of any other row.
129+
3. You can add any row to a multiple of any other row
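
To make the three moves concrete, here is a small sketch of each one acting on a NumPy array. The helper names `swap_rows`, `scale_row`, and `add_multiple` are made up for this illustration, and the matrix is the 3x4 augmented matrix that appears in the algorithm steps of this section.

```python
import numpy as np

def swap_rows(m, i, j):
    """Move 1: swap rows i and j in place."""
    m[[i, j]] = m[[j, i]]

def scale_row(m, i, factor):
    """Move 2: multiply row i by a non-zero factor in place."""
    if factor == 0:
        raise ValueError("scale factor must be non-zero")
    m[i] *= factor

def add_multiple(m, target, source, factor):
    """Move 3: add `factor` times row `source` to row `target` in place."""
    m[target] += factor * m[source]

B = np.array([[2.0, 3.0, 4.0, 6.0],
              [1.0, 2.0, 3.0, 4.0],
              [3.0, -4.0, 0.0, 10.0]])
swap_rows(B, 0, 2)               # rows 0 and 2 trade places
add_multiple(B, 1, 0, -1.0 / 3)  # clears the first entry of row 1
scale_row(B, 2, 0.5)             # halves every entry of row 2
```

None of these moves changes the solution of the underlying system, which is why they can be combined freely.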
130130

131131
That's it. Before continuing, I suggest you try to recreate the Row Echelon matrix we made above. That is, do the following:
132132

@@ -144,9 +144,9 @@ $$\left[
144144
\end{array}
145145
\right]$$
146146

147-
There are plenty of different strategies you could use to do this, and no one is better than the rest. Personally, I usually try to multiply each row in the matrix by different values and add rows together until the first column is all the same value, and then I subtract the first row from all subsequent rows. I then do the same thing for the following columns.
147+
There are plenty of different strategies you could use to do this, and no one strategy is better than the rest. Personally, I usually try to multiply each row in the matrix by different values and add rows together until the first column is all the same value, and then I subtract the first row from all subsequent rows. I then do the same thing for the following columns.
148148

149-
After you get an upper triangular matrix, the next step is diagonalizing to create the Reduced Row Eschelon form. In other words, we do the following:
149+
After you get an upper triangular matrix, the next step is diagonalizing to create the Reduced Row Echelon Form. In other words, we do the following:
150150

151151
$$\left[
152152
\begin{array}{ccc|c}
@@ -168,11 +168,9 @@ Here, the idea is similar to above. You can do basically anything you want. My s
168168

169169
Now, the analytical method may seem straightforward, but the algorithm does not obviously follow from the game we were playing before, so we'll go through it step-by-step.
170170

171-
In general, we go through the following process:
171+
In general, follow this process:
172172

173173
1. For each column `col`, find the highest value
174-
175-
176174
$$
177175
\left[
178176
\begin{array}{ccc|c}
@@ -182,14 +180,69 @@ $$
182180
\end{array}
183181
\right]
184182
$$
183+
2. Swap the row with the highest valued element with the `col`th row.
184+
$$
185+
\left[
186+
\begin{array}{ccc|c}
187+
\mathbf{2} & \mathbf{3} & \mathbf{4} & \mathbf{6} \\
188+
1 & 2 & 3 & 4 \\
189+
\mathbf{3} & \mathbf{-4} & \mathbf{0} & \mathbf{10}
190+
\end{array}
191+
\right]
192+
\rightarrow
193+
\left[
194+
\begin{array}{ccc|c}
195+
\mathbf{3} & \mathbf{-4} & \mathbf{0} & \mathbf{10} \\
196+
1 & 2 & 3 & 4 \\
197+
\mathbf{2} & \mathbf{3} & \mathbf{4} & \mathbf{6}
198+
\end{array}
199+
\right]
200+
$$
201+
3. For all remaining rows, find a fraction that corresponds to the ratio of the lower value in that column to the central pivot \(the one you swapped to the top\)
202+
$$
203+
\rightarrow
204+
\left[
205+
\begin{array}{ccc|c}
206+
3 & -4 & 0 & 10 \\
207+
\mathbf{1} & 2 & 3 & 4 \\
208+
2 & 3 & 4 & 6
209+
\end{array}
210+
\right] \\
211+
\begin{align}
212+
f &= A(\text{curr_row}, \text{pivot_col}) / A(\text{pivot_row}, \text{pivot_col}) \\
213+
&= \frac{1}{3}
214+
\end{align}
215+
$$
216+
4. Set all values in the corresponding rows to be the value they were before $$-$$ the top row $$\times$$ the fraction. This is essentially performing move 3 from above, except with an optimal multiplicative factor.
217+
$$
218+
A(\text{curr_row}, \text{curr_col}) \mathrel{-}= A(\text{pivot_row}, \text{curr_col}) \times f \\
219+
\left[
220+
\begin{array}{ccc|c}
221+
3 & -4 & 0 & 10 \\
222+
\mathbf{1} & \mathbf{2} & \mathbf{3} & \mathbf{4} \\
223+
2 & 3 & 4 & 6
224+
\end{array}
225+
\right]
226+
\rightarrow
227+
\left[
228+
\begin{array}{ccc|c}
229+
3 & -4 & 0 & 10 \\
230+
\mathbf{0} & \mathbf{\frac{10}{3}} & \mathbf{3} & \mathbf{\frac{2}{3}} \\
231+
2 & 3 & 4 & 6
232+
\end{array}
233+
\right]
234+
$$
235+
5. Set the value of that row's pivot column to 0.
236+
$$
237+
\left[
238+
\begin{array}{ccc|c}
239+
3 & -4 & 0 & 10 \\
240+
0 & \frac{10}{3} & 3 & \frac{2}{3} \\
241+
2 & 3 & 4 & 6
242+
\end{array}
243+
\right]
185244
186-
187-
1. Swap the row with the highest valued element with the `col`th row.
188-
2. For all remaining rows, find a fraction that corresponds to the ratio of the lower value in that column to the central pivot \(the one you swapped to the top\)
189-
3. Set all values in the corresponding rows to be the value they were before $$-$$ the top row $$\times$$ the fraction. This is essentially performing move 3 from above, except with an optimal multiplicative factor.
190-
4. Set the value of that row's pivot column to 0.
191-
192-
ADD VISUALIZATION OF ABOVE
245+
$$
193246

194247
In code, this looks like:
195248

@@ -223,14 +276,39 @@ for k = 1:min(rows,cols):
223276
# Step 4: re-evaluate each element
224277
A[i,j] = A[i,j] - A[k,j]*fraction
225278

226-
# Step 5: Set lower elements to 0
227-
A[i,k] = 0
228279
end
280+
281+
# Step 5: Set lower elements to 0
282+
A[i,k] = 0
229283
end
230284
end
231285
```
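
For readers who want something they can run, the steps above could be sketched in Python roughly as follows. This is an illustration under stated assumptions rather than the chapter's reference implementation: it assumes NumPy, uses 0-based indexing, picks the pivot by largest absolute value, works on a copy of the input, and simply raises an error when a column has no usable pivot. The name `gaussian_elimination` is a hypothetical choice.

```python
import numpy as np

def gaussian_elimination(matrix):
    """Return a row echelon copy of an augmented matrix (illustrative sketch)."""
    A = np.array(matrix, dtype=float)   # copy, so the caller's data is untouched
    rows, cols = A.shape

    for k in range(min(rows, cols)):
        # Steps 1 and 2: find the largest magnitude in column k, at or below
        # row k, and swap that row into position k
        max_row = k + np.argmax(np.abs(A[k:, k]))
        if A[max_row, k] == 0:
            raise ValueError("no non-zero pivot in column %d" % k)
        A[[k, max_row]] = A[[max_row, k]]

        for i in range(k + 1, rows):
            # Step 3: ratio of the lower value to the pivot
            fraction = A[i, k] / A[k, k]
            # Step 4: subtract the scaled pivot row from the current row
            A[i, k + 1:] -= fraction * A[k, k + 1:]
            # Step 5: set the entry below the pivot to exactly 0
            A[i, k] = 0.0
    return A

R = gaussian_elimination([[2, 3, 4, 6],
                          [1, 2, 3, 4],
                          [3, -4, 0, 10]])
```

Setting the lower entry to zero explicitly, instead of relying on the subtraction, avoids leaving tiny floating-point residue below the diagonal.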
232286

233287
As with all code, it takes time to fully absorb what is going on and why everything is happening; however, I have tried to comment the above pseudocode with the necessary steps. Let me know if anything is unclear!
234288

235-
Now, as for what's next... Well, we are in for a treat! The above algorithms clearly has 3 `for` loops, and will thus have a complexity of $$\sim O(n^3)$$, which is abysmal! If we can reduce the matrix to a specifically **tridiagonal** matrix, we can actually solve the system in $$\sim O(n)$$! How? Well, we can use an algorithm known as the _Tri-Diagonal Matrix Algorithm_ \(TDMA\) also known as the _Thomas Algorithm_.
289+
Now, to be clear: this algorithm creates an upper-triangular matrix. In other words, it only creates a matrix in _Row Echelon Form_, not _**Reduced** Row Echelon Form_! So what do we do from here? Well, we could create another step to further reduce the matrix, but another method would be to use _Back-Substitution_.
290+
291+
The back-substitution method is precisely what we said above. If we have a matrix in Row Echelon Form, we can directly solve for $$z$$, and then plug that value in to find $$y$$ and then plug both of those values in to find $$x$$! Even though this seems straightforward, the pseudocode might not be as simple as you thought!
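
Written as a formula, the substitution described above might look like the following. The symbols are assumptions introduced here for illustration: $$a_{ij}$$ are the entries of the upper triangular part, $$b_i$$ is the right-hand column, $$x_i$$ are the unknowns (so $$x_1, x_2, x_3$$ play the roles of $$x, y, z$$), and $$n$$ is the number of equations.

$$
x_i = \frac{1}{a_{ii}} \left( b_i - \sum_{j=i+1}^{n} a_{ij} x_j \right), \qquad i = n, n-1, \ldots, 1
$$

For $$i = n$$ the sum is empty, so the last unknown is simply $$x_n = b_n / a_{nn}$$, and each earlier unknown only depends on values that have already been found.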
292+
293+
294+
```python
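import numpy as np

def back_substitution(A):
    # A has shape (rows, cols) and is already in row echelon form.
    # Note this includes the right-hand side of the set of equations,
    # so a square system is assumed to have cols == rows + 1
    rows, cols = A.shape

    # This is our solutions vector, of size 'rows'
    soln = np.zeros(rows)

    # Initializing the last element
    soln[rows - 1] = A[rows - 1, cols - 1] / A[rows - 1, rows - 1]

    # Stepping backwards through the solutions vector
    for i in range(rows - 2, -1, -1):
        total = 0.0
        for j in range(rows - 1, i, -1):
            # Accumulate the contribution of the unknowns already solved
            total += A[i, j] * soln[j]
        soln[i] = (A[i, cols - 1] - total) / A[i, i]

    return soln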
```
312+
313+
Now, as for what's next... Well, we are in for a treat! The Gaussian elimination algorithm above clearly has 3 `for` loops, and will thus have a complexity of $$\sim O(n^3)$$, which is abysmal! If the matrix happens to be **tridiagonal**, we can actually solve the system in $$\sim O(n)$$! How? Well, we can use an algorithm known as the _Tri-Diagonal Matrix Algorithm_ \(TDMA\), also known as the _Thomas Algorithm_.
236314

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
# Thomas Algorithm
