Gaussian Elimination
Motivation
Given a system of linear equations \(A\mathbf{x} = \mathbf{b}\), how do you find the solution? The answer taught in every linear algebra course is Gaussian elimination: a systematic procedure of row operations that converts the system into a form where the solution is obvious (Strang 2016). The algorithm is named after Carl Friedrich Gauss, though equivalent methods appear in Chinese mathematics (《九章算術》, Nine Chapters on the Mathematical Art) nearly two thousand years earlier. Gaussian elimination is the foundational algorithm of numerical linear algebra and runs inside every modern solver.
Key Idea
The core insight is that certain operations on the rows of a system leave the solution set unchanged:
- Swap two rows.
- Scale a row by a nonzero constant.
- Add a multiple of one row to another row.
By applying these operations strategically, we simplify the system until the unknowns can be read off directly.
The Augmented Matrix
Rather than rewriting equations repeatedly, encode everything in the augmented matrix \([A \mid \mathbf{b}]\): the coefficient matrix \(A\) with the right-hand-side vector \(\mathbf{b}\) appended as an extra column.
\[ \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \mathbf{x} = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \quad \longleftrightarrow \quad \left(\begin{array}{ccc|c} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{array}\right). \]
Row operations on \([A \mid \mathbf{b}]\) correspond exactly to the legal transformations above.
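For instance, the augmented matrix can be stored as a list of rows, each carrying its right-hand-side entry last, so a row operation acts on coefficients and right-hand side together (a minimal sketch using the system solved later in this section):

```python
# Encode [A | b] as a list of rows; the b entry rides along as the last column.
A_aug = [
    [1.0, 2.0,  1.0, 9.0],  #  x + 2y + z = 9
    [2.0, 1.0, -1.0, 1.0],  # 2x +  y - z = 1
    [3.0, 1.0,  1.0, 8.0],  # 3x +  y + z = 8
]
# The row operation R2 <- R2 - 2*R1 transforms the whole row, b entry included:
A_aug[1] = [a - 2.0 * b for a, b in zip(A_aug[1], A_aug[0])]
print(A_aug[1])  # [0.0, -3.0, -3.0, -17.0]
```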
The Algorithm
Forward elimination reduces \([A \mid \mathbf{b}]\) to row echelon form (upper triangular with zeros below each pivot):
```
for each column j from left to right:
    find a nonzero entry in column j at or below the current row (the "pivot")
    swap that row to the current position
    for each row below the pivot row:
        subtract (entry / pivot) * pivot_row from that row
    advance to the next row
```
The result is an upper-triangular system. Back substitution then solves for the unknowns from bottom to top.
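The two phases can be sketched in Python. This is a minimal illustration, not production code: `gaussian_elimination` is a hypothetical helper name, the matrix is assumed square and invertible, and `Fraction` arithmetic is used so the worked example below comes out exact.

```python
from fractions import Fraction

def gaussian_elimination(A, b):
    """Solve A x = b (A square, invertible) by forward elimination
    followed by back substitution, using exact rational arithmetic."""
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    for col in range(n):
        # Find a nonzero pivot at or below the current row and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [M[r][j] - factor * M[col][j] for j in range(n + 1)]
    # Back substitution, bottom row to top.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x
```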
Worked Example
Solve:
\[ \begin{aligned} x + 2y + z &= 9 \\ 2x + y - z &= 1 \\ 3x + y + z &= 8. \end{aligned} \]
Augmented matrix:
\[ \left(\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 2 & 1 & -1 & 1 \\ 3 & 1 & 1 & 8 \end{array}\right). \]
Step 1: eliminate \(x\) from rows 2 and 3 using row 1 as the pivot row.
\(R_2 \leftarrow R_2 - 2R_1\): \((2-2, 1-4, -1-2, 1-18) = (0, -3, -3, -17)\).
\(R_3 \leftarrow R_3 - 3R_1\): \((3-3, 1-6, 1-3, 8-27) = (0, -5, -2, -19)\).
\[ \left(\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & -3 & -3 & -17 \\ 0 & -5 & -2 & -19 \end{array}\right). \]
Step 2: eliminate \(y\) from row 3 using row 2.
\(R_3 \leftarrow R_3 - \frac{5}{3}R_2\): \((0, -5+5, -2+5, -19+\frac{85}{3}) = (0, 0, 3, \frac{28}{3})\).
\[ \left(\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & -3 & -3 & -17 \\ 0 & 0 & 3 & 28/3 \end{array}\right). \]
Back substitution:
- Row 3: \(3z = 28/3 \implies z = 28/9\).
- Row 2: \(-3y - 3(28/9) = -17 \implies -3y = -17 + 84/9 = -69/9 \implies y = 23/9\).
- Row 1: \(x + 2(23/9) + 28/9 = 9 \implies x = 9 - 46/9 - 28/9 = 9 - 74/9 = 7/9\).
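As a sanity check, the solution \((x, y, z) = (7/9,\, 23/9,\, 28/9)\) can be substituted back into the original three equations, here in Python with exact rationals:

```python
from fractions import Fraction as F

x, y, z = F(7, 9), F(23, 9), F(28, 9)
# Each original equation is satisfied exactly.
assert x + 2*y + z == 9   # 7/9 + 46/9 + 28/9 = 81/9 = 9
assert 2*x + y - z == 1   # 14/9 + 23/9 - 28/9 =  9/9 = 1
assert 3*x + y + z == 8   # 21/9 + 23/9 + 28/9 = 72/9 = 8
```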
What the Echelon Form Reveals
After elimination, the number of pivots (nonzero leading entries) equals the rank of \(A\). This immediately answers the three-outcome question from systems of linear equations:
- \(n\) pivots: unique solution.
- Fewer than \(n\) pivots and no contradictory row: infinitely many solutions (free variables exist).
- Row of \(A\) is all zeros but corresponding entry of \(\mathbf{b}\) is nonzero: no solution (a contradiction like \(0 = 5\)).
Reduced Row Echelon Form
Going further — also eliminating entries above each pivot and scaling pivots to \(1\) — produces reduced row echelon form (RREF). In RREF, each pivot column is a standard basis vector, and the solution (or parameterization) can be read off directly without back-substitution.
\[ \text{echelon:} \quad \begin{pmatrix} 2 & 4 & 1 \\ 0 & 3 & 2 \\ 0 & 0 & 5 \end{pmatrix} \qquad \text{RREF:} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
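A compact RREF routine in the same spirit (a sketch: `rref` is a hypothetical helper name, and `Fraction` keeps the arithmetic exact):

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M (a list of rows)."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # Find a pivot in column c at or below row r; skip the column if none.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [v / M[r][c] for v in M[r]]        # scale the pivot to 1
        for i in range(rows):                     # clear above AND below
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M
```

Running it on the echelon-form matrix above yields the identity, matching the RREF shown.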
Gauss–Jordan Elimination
Running both forward elimination and the upward pass to reach RREF is called Gauss–Jordan elimination, named after the geodesist Wilhelm Jordan. Applying it to \([A \mid I]\) (augmenting with the identity matrix) produces \([I \mid A^{-1}]\), computing the matrix inverse when \(A\) is square and invertible.
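The \([A \mid I] \to [I \mid A^{-1}]\) trick can be sketched directly (`invert` is a hypothetical helper name; the sketch assumes \(A\) is square and invertible):

```python
from fractions import Fraction

def invert(A):
    """Invert a square matrix via Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    # Augment with the identity: each row gets n extra columns.
    M = [[Fraction(A[i][j]) for j in range(n)]
         + [Fraction(1 if j == i else 0) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]  # scale pivot row to 1
        for r in range(n):                          # clear above and below
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    # The left half is now I; the right half is the inverse.
    return [row[n:] for row in M]
```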
Complexity
Eliminating an \(n \times n\) system requires \(O(n^3)\) arithmetic operations. For an \(m \times n\) matrix, the cost is \(O(m n^2)\) (assuming \(m \geq n\)). This cubic cost is the bottleneck in many numerical algorithms — it’s why large neural networks do not solve linear systems at every step.
Numerical Considerations
In floating-point arithmetic, the order in which rows are processed matters. A tiny pivot can amplify rounding errors. Partial pivoting — always swapping to bring the largest-magnitude element in the column to the pivot position — keeps errors small and is the standard in production software (Trefethen and Bau 1997).
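The damage a tiny pivot can do is visible even in a two-equation system (a minimal sketch; `solve2_no_pivot` is a hypothetical helper that deliberately skips the row swap):

```python
def solve2_no_pivot(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 system by eliminating with the (1,1) entry -- no row swap."""
    m = a21 / a11
    y = (b2 - m * b1) / (a22 - m * a12)
    x = (b1 - a12 * y) / a11
    return x, y

# System: 1e-20*x + y = 1,  x + y = 2.  The true solution is x ~ 1, y ~ 1.
x, y = solve2_no_pivot(1e-20, 1.0, 1.0, 1.0, 1.0, 2.0)
print(x, y)  # x collapses to 0.0: the huge multiplier 1e20 swamps row 2

# Swapping the rows first (partial pivoting) makes the multiplier tiny:
x, y = solve2_no_pivot(1.0, 1.0, 1e-20, 1.0, 2.0, 1.0)
print(x, y)  # recovers x = 1.0, y = 1.0
```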