Systems of Linear Equations

Motivation

Many problems reduce to finding unknown values that simultaneously satisfy several constraints, each of which is a linear equation. A robot localization problem might constrain coordinates from multiple sensor readings; a recommendation system might constrain user preferences from observed ratings; a circuit might constrain voltages from Kirchhoff’s laws. All of these are systems of linear equations (Strang 2016). Understanding when solutions exist, how many there are, and how to find them is foundational for machine learning and applied mathematics.

Definition

A system of \(m\) linear equations in \(n\) unknowns has the form

\[ \begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\ &\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m. \end{aligned} \]

The \(a_{ij}\) are known coefficients, the \(b_i\) are known right-hand sides, and \(x_1, \ldots, x_n\) are the unknowns.

Matrix form. Stacking everything into a matrix \(A \in \mathbb{R}^{m \times n}\) and vectors \(\mathbf{x} \in \mathbb{R}^n\), \(\mathbf{b} \in \mathbb{R}^m\):

\[ A\mathbf{x} = \mathbf{b}. \]

This compact notation is not just convenient — it reveals that solving a linear system means finding a vector \(\mathbf{x}\) whose image under the linear map \(A\) is \(\mathbf{b}\).

Examples

One equation, one unknown

\[ 3x = 6 \implies x = 2. \]

Exactly one solution.

Two equations, two unknowns

\[ \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 4 \end{pmatrix}. \]

The first equation is \(x + y = 5\), the second is \(2x - y = 4\). Adding them: \(3x = 9\), so \(x = 3\), \(y = 2\). Unique solution.
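This small system can be checked numerically. A minimal sketch using NumPy (an assumption; any linear algebra library would do):

```python
import numpy as np

# Coefficient matrix and right-hand side from the 2x2 example above.
A = np.array([[1.0, 1.0],
              [2.0, -1.0]])
b = np.array([5.0, 4.0])

# Solves A x = b for square, invertible A.
x = np.linalg.solve(A, b)
print(x)  # [3. 2.]
```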

Geometrically

Each equation in \(\mathbb{R}^2\) defines a line. The solution set is the intersection:

  - Unique solution: the lines \(x + y = 5\) and \(2x - y = 4\) cross at the single point \((3, 2)\).
  - No solution: the lines \(x + y = 5\) and \(x + y = 2\) are parallel and never meet.
  - Infinitely many solutions: both equations describe the same line, so the lines overlap everywhere.

In \(\mathbb{R}^3\) each equation defines a plane; solutions are intersections of planes.

Three Possible Outcomes

For any system \(A\mathbf{x} = \mathbf{b}\), exactly one of three things is true:

  1. Unique solution. Exactly one \(\mathbf{x}\) satisfies all equations. (The “square” case: \(m = n\) and the equations are independent.)
  2. No solution. The equations are contradictory — inconsistent. (E.g., \(x = 1\) and \(x = 2\) simultaneously.)
  3. Infinitely many solutions. At least one equation is redundant; the solution set is a line, plane, or higher-dimensional affine subspace.

Which case applies depends on the rank of \(A\) relative to the rank of the augmented matrix \([A \mid \mathbf{b}]\).

| Condition | Outcome |
| --- | --- |
| \(\operatorname{rank}(A) = \operatorname{rank}([A\mid\mathbf{b}]) = n\) | unique solution |
| \(\operatorname{rank}(A) = \operatorname{rank}([A\mid\mathbf{b}]) < n\) | infinitely many solutions |
| \(\operatorname{rank}(A) < \operatorname{rank}([A\mid\mathbf{b}])\) | no solution |
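The rank test above translates directly into code. A hedged sketch using NumPy's `matrix_rank` (the helper name `classify` is my own):

```python
import numpy as np

def classify(A, b):
    """Classify A x = b by comparing rank(A) with rank of [A | b]."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[1]
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.hstack([A, b]))
    if rA < rAb:
        return "no solution"          # b is outside the column space
    return "unique solution" if rA == n else "infinitely many"

print(classify([[1, 1], [2, -1]], [5, 4]))   # unique solution
print(classify([[1, 1], [1, 1]], [5, 2]))    # no solution
print(classify([[1, 1], [2, 2]], [5, 10]))   # infinitely many
```

Numerical rank is computed from singular values, so near-singular matrices may be classified differently depending on tolerance.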

The Homogeneous Case

When \(\mathbf{b} = \mathbf{0}\), the system \(A\mathbf{x} = \mathbf{0}\) is homogeneous. It always has at least the trivial solution \(\mathbf{x} = \mathbf{0}\). If \(A\) has more columns than rows (\(n > m\)), the system must have infinitely many solutions (the null space is non-trivial).

The solution set of \(A\mathbf{x} = \mathbf{0}\) is the null space of \(A\), a subspace of \(\mathbb{R}^n\).
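A null-space basis can be extracted from the SVD: the right singular vectors with zero singular value span \(\ker(A)\). A minimal sketch (the `null_space` helper and its tolerance are my own choices; SciPy ships a similar `scipy.linalg.null_space`):

```python
import numpy as np

def null_space(A, tol=1e-12):
    """Orthonormal basis for ker(A): right singular vectors
    whose singular values are numerically zero."""
    A = np.asarray(A, dtype=float)
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T  # shape (n, n - rank)

# Wide matrix (n > m): the null space must be non-trivial.
A = np.array([[1.0, 2.0, 3.0]])
N = null_space(A)
print(N.shape)                 # (3, 2): two basis vectors
print(np.allclose(A @ N, 0))   # True: A maps each basis vector to zero
```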

The General Solution

If \(\mathbf{x}_p\) is any particular solution to \(A\mathbf{x} = \mathbf{b}\), and \(\mathbf{x}_h\) is any solution to the homogeneous system \(A\mathbf{x} = \mathbf{0}\), then \(\mathbf{x}_p + \mathbf{x}_h\) is also a solution to \(A\mathbf{x} = \mathbf{b}\). The complete solution set is

\[ \{\mathbf{x}_p + \mathbf{x}_h : \mathbf{x}_h \in \ker(A)\}. \]

This structure — particular solution plus null space — reappears in differential equations, physics, and optimization.
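The particular-plus-homogeneous structure can be verified directly. A sketch for the one-equation system \(x + y = 5\), where the null space is spanned by \((1, -1)\) (a fact read off by inspection):

```python
import numpy as np

# Underdetermined system: one equation, two unknowns (x + y = 5).
A = np.array([[1.0, 1.0]])
b = np.array([5.0])

# lstsq returns a particular (minimum-norm) solution.
x_p = np.linalg.lstsq(A, b, rcond=None)[0]
x_h = np.array([1.0, -1.0])  # spans ker(A): A @ x_h = 0

# Every x_p + t * x_h also solves A x = b.
for t in (-2.0, 0.0, 3.5):
    assert np.allclose(A @ (x_p + t * x_h), b)
print(x_p)  # [2.5 2.5]
```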

Overdetermined and Underdetermined Systems

Overdetermined (\(m > n\)): more equations than unknowns. Typically no exact solution exists. The practical fix is to find the \(\mathbf{x}\) that minimizes \(\|A\mathbf{x} - \mathbf{b}\|^2\) — this is linear regression (ordinary least squares).

Underdetermined (\(m < n\)): fewer equations than unknowns. Typically infinitely many solutions. The practical fix is to pick a preferred one, usually the minimum-norm solution (via the pseudoinverse); adding an explicit regularization penalty instead gives ridge regression.
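Both cases are one call in NumPy. A hedged sketch (the specific matrices are illustrative, not from the text):

```python
import numpy as np

# Overdetermined (m > n): three inconsistent equations, two unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.1])

# Ordinary least squares: minimize ||A x - b||^2.
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

# Underdetermined (m < n): minimum-norm solution via the pseudoinverse.
A2 = np.array([[1.0, 1.0, 1.0]])
b2 = np.array([6.0])
x_mn = np.linalg.pinv(A2) @ b2
print(x_ls)   # least-squares fit to the inconsistent system
print(x_mn)   # [2. 2. 2.], the shortest x with x1 + x2 + x3 = 6
```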

Solving Systems

The standard algorithm for solving \(A\mathbf{x} = \mathbf{b}\) is Gaussian elimination, which converts \(A\) to row-echelon form through systematic row operations. For square, invertible \(A\), the solution is

\[ \mathbf{x} = A^{-1}\mathbf{b}, \]

but computing the inverse is less efficient than elimination for large systems.
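In practice this means preferring a factorization-based solver over explicit inversion. A minimal comparison, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n))  # random square matrix, invertible w.h.p.
b = rng.standard_normal(n)

# Preferred: factorization-based solve (LU decomposition under the hood).
x1 = np.linalg.solve(A, b)

# Works, but forms A^{-1} explicitly: slower and less accurate.
x2 = np.linalg.inv(A) @ b

print(np.allclose(x1, x2))  # True, up to floating-point tolerance
```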

References

Strang, Gilbert. 2016. Introduction to Linear Algebra. 5th ed. Wellesley-Cambridge Press.