Orthogonal Bases
Motivation
A general basis lets us assign coordinates to every vector, but finding those coordinates requires solving a linear system: given \(\mathbf{v}\), we have to solve \(B\mathbf{c} = \mathbf{v}\) for the coefficients \(\mathbf{c}\), where \(B\) is the matrix whose columns are the basis vectors. This can be expensive, and the resulting coordinates have no special geometric meaning.
An orthogonal basis is a basis whose vectors are mutually perpendicular (Trefethen and Bau 1997). With an orthogonal basis, finding the coordinates of any vector collapses to a sequence of inner products — no linear system needed. Coordinates become meaningful in their own right: they are the lengths of the projections onto each basis direction. Orthogonal bases are the foundation of least-squares regression, principal components analysis, Fourier series, and the singular value decomposition.
Definition
A set of vectors \(\mathbf{q}_1, \ldots, \mathbf{q}_n \in \mathbb{R}^n\) is an orthogonal basis of \(\mathbb{R}^n\) if it is a basis and the vectors are pairwise orthogonal:
\[ \mathbf{q}_i^\top \mathbf{q}_j = 0 \quad \text{for all } i \neq j. \]
If in addition each \(\mathbf{q}_i\) has unit length, \(\|\mathbf{q}_i\| = 1\), the basis is orthonormal:
\[ \mathbf{q}_i^\top \mathbf{q}_j = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j. \end{cases} \]
Every orthogonal basis can be normalized into an orthonormal one by dividing each vector by its norm.
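As a quick numerical sanity check, here is a minimal NumPy sketch of this normalization (the particular matrix is an arbitrary illustration):

```python
import numpy as np

# Columns are pairwise orthogonal but not unit length (arbitrary example).
Q = np.array([[3.0, -4.0],
              [4.0,  3.0]])

# Divide each column by its Euclidean norm.
Q_unit = Q / np.linalg.norm(Q, axis=0)

# The normalized columns are orthonormal: Q^T Q should equal the identity.
print(np.allclose(Q_unit.T @ Q_unit, np.eye(2)))  # True
```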
Standard basis
The standard basis \(\mathbf{e}_1, \ldots, \mathbf{e}_n\) is orthonormal: \(\mathbf{e}_i^\top \mathbf{e}_j = \delta_{ij}\) directly from the entries.
A rotated orthonormal basis of \(\mathbb{R}^2\)
\[ \mathbf{q}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \mathbf{q}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]
Check: \(\mathbf{q}_1^\top \mathbf{q}_2 = \tfrac{1}{2}(1 - 1) = 0\), and \(\|\mathbf{q}_1\| = \|\mathbf{q}_2\| = 1\). These vectors point along the coordinate axes rotated by \(45°\): \(\mathbf{q}_1\) is \(\mathbf{e}_1\) rotated by \(45°\), and \(\mathbf{q}_2\) spans the perpendicular line \(y = -x\).
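The checks above are easy to reproduce numerically; a minimal sketch:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)

print(np.isclose(q1 @ q2, 0.0))             # orthogonal
print(np.isclose(np.linalg.norm(q1), 1.0))  # unit length
print(np.isclose(np.linalg.norm(q2), 1.0))  # unit length
```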
Orthogonality Implies Independence
Any set of nonzero pairwise-orthogonal vectors is automatically linearly independent. Suppose
\[ c_1 \mathbf{q}_1 + \cdots + c_n \mathbf{q}_n = \mathbf{0}. \]
Take the inner product of both sides with \(\mathbf{q}_i\):
\[ c_i \|\mathbf{q}_i\|^2 = 0, \]
using \(\mathbf{q}_i^\top \mathbf{q}_j = 0\) for \(j \neq i\). Since \(\|\mathbf{q}_i\|^2 > 0\), we get \(c_i = 0\). This holds for every \(i\), so the set is independent.
This means orthogonality is a stronger condition than independence — any \(n\) pairwise-orthogonal nonzero vectors in \(\mathbb{R}^n\) are automatically a basis. There is no need to separately verify spanning.
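The argument can also be seen numerically: stacking pairwise-orthogonal vectors as columns gives a matrix whose Gram matrix is diagonal with positive entries, hence full rank. A sketch with an arbitrary orthogonal triple in \(\mathbb{R}^3\):

```python
import numpy as np

# Three pairwise-orthogonal nonzero vectors (arbitrary example).
q1 = np.array([1.0, 2.0, 2.0])
q2 = np.array([2.0, 1.0, -2.0])
q3 = np.array([2.0, -2.0, 1.0])
A = np.column_stack([q1, q2, q3])

print(A.T @ A)                        # diagonal Gram matrix (9's on the diagonal)
print(np.linalg.matrix_rank(A) == 3)  # True: independent, hence a basis of R^3
```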
Coordinates Are Inner Products
The defining advantage of an orthogonal basis: coordinates are computed by inner products instead of by solving a linear system.
If \(\mathbf{q}_1, \ldots, \mathbf{q}_n\) is an orthogonal basis of \(\mathbb{R}^n\) and \(\mathbf{v} \in \mathbb{R}^n\), then
\[ \mathbf{v} = \sum_{i=1}^n \frac{\mathbf{q}_i^\top \mathbf{v}}{\|\mathbf{q}_i\|^2}\, \mathbf{q}_i. \]
If the basis is orthonormal, \(\|\mathbf{q}_i\| = 1\) and the formula simplifies to
\[ \mathbf{v} = \sum_{i=1}^n (\mathbf{q}_i^\top \mathbf{v})\, \mathbf{q}_i. \]
Why. Writing \(\mathbf{v} = c_1 \mathbf{q}_1 + \cdots + c_n \mathbf{q}_n\) and taking the inner product with \(\mathbf{q}_i\) kills every term except the \(i\)-th, leaving \(\mathbf{q}_i^\top \mathbf{v} = c_i \|\mathbf{q}_i\|^2\). Solve for \(c_i\).
The geometric content is that each coordinate is the (signed) length of the projection of \(\mathbf{v}\) onto the corresponding basis direction. The basis vectors do not interact when extracting coordinates — orthogonality is exactly the absence of interaction.
Example
Use the rotated basis \(\mathbf{q}_1, \mathbf{q}_2\) from above to find the coordinates of \(\mathbf{v} = (3, 1)^\top\):
\[ \mathbf{q}_1^\top \mathbf{v} = \tfrac{1}{\sqrt{2}}(3 + 1) = \tfrac{4}{\sqrt{2}} = 2\sqrt{2}, \]
\[ \mathbf{q}_2^\top \mathbf{v} = \tfrac{1}{\sqrt{2}}(3 - 1) = \tfrac{2}{\sqrt{2}} = \sqrt{2}. \]
So \(\mathbf{v} = 2\sqrt{2}\, \mathbf{q}_1 + \sqrt{2}\, \mathbf{q}_2\). No linear system was solved — only two inner products. Compare with the general-basis example in bases and coordinates, which required solving a \(2 \times 2\) system.
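A short sketch confirms the computation; the helper coords_orthogonal is an illustrative name, not a library routine, and it implements the general formula with the \(\|\mathbf{q}_i\|^2\) denominators, so it also handles orthogonal bases that are not normalized:

```python
import numpy as np

def coords_orthogonal(Q, v):
    """Coordinates of v in the orthogonal basis given by the columns of Q:
    c_i = (q_i . v) / ||q_i||^2. No linear system is solved."""
    return (Q.T @ v) / np.sum(Q**2, axis=0)

Q = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)  # the rotated basis, as columns
v = np.array([3.0, 1.0])

c = coords_orthogonal(Q, v)
print(c)                      # [2.828..., 1.414...] = [2*sqrt(2), sqrt(2)]
print(np.allclose(Q @ c, v))  # the expansion reconstructs v
```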
Orthogonal Matrices
Stacking an orthonormal basis as the columns of a matrix \(Q = [\mathbf{q}_1 \;\cdots\; \mathbf{q}_n] \in \mathbb{R}^{n \times n}\) produces an orthogonal matrix, characterized by
\[ Q^\top Q = I, \]
equivalently \(Q^{-1} = Q^\top\). The inverse of an orthogonal matrix is its transpose — no Gaussian elimination required.
Orthogonal matrices preserve lengths and angles:
\[ (Q\mathbf{x})^\top (Q\mathbf{y}) = \mathbf{x}^\top Q^\top Q \mathbf{y} = \mathbf{x}^\top \mathbf{y}, \qquad \|Q\mathbf{x}\| = \|\mathbf{x}\|. \]
They represent rigid motions of \(\mathbb{R}^n\) — rotations and reflections.
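A quick numerical confirmation of these properties, using an arbitrary rotation matrix as the example:

```python
import numpy as np

theta = 0.7  # arbitrary angle; every rotation matrix is orthogonal
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))     # Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))  # the inverse is the transpose

x = np.array([3.0, -1.0])
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # length preserved
```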
Orthogonal Projections
If \(W \subseteq \mathbb{R}^n\) is a subspace with orthonormal basis \(\mathbf{q}_1, \ldots, \mathbf{q}_k\), the orthogonal projection of any \(\mathbf{v} \in \mathbb{R}^n\) onto \(W\) is
\[ \operatorname{proj}_W(\mathbf{v}) = \sum_{i=1}^k (\mathbf{q}_i^\top \mathbf{v})\, \mathbf{q}_i. \]
This is the closest point in \(W\) to \(\mathbf{v}\) in Euclidean distance. The residual \(\mathbf{v} - \operatorname{proj}_W(\mathbf{v})\) is orthogonal to every vector in \(W\).
Without an orthonormal basis of \(W\), computing the projection requires solving the normal equations \(A^\top A \mathbf{c} = A^\top \mathbf{v}\), where the columns of \(A\) form a basis of \(W\); this is exactly the linear-system cost an orthonormal basis avoids.
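A sketch of the projection formula for a plane in \(\mathbb{R}^3\) (the basis vectors are arbitrary illustrations), checking that the residual is orthogonal to \(W\):

```python
import numpy as np

# Orthonormal basis of a 2-dimensional subspace W of R^3 (arbitrary example).
q1 = np.array([1.0, 2.0, 2.0]) / 3.0
q2 = np.array([2.0, 1.0, -2.0]) / 3.0
Q = np.column_stack([q1, q2])

v = np.array([1.0, 0.0, 0.0])

# proj_W(v) = sum_i (q_i . v) q_i, i.e. Q Q^T v -- inner products only.
proj = Q @ (Q.T @ v)

residual = v - proj
print(np.allclose(Q.T @ residual, 0.0))  # residual is orthogonal to W
```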
Constructing an Orthonormal Basis
Most subspaces do not come with an orthonormal basis already chosen. The standard way to build one from an arbitrary basis is the Gram-Schmidt algorithm: take the basis vectors in turn, subtract from each its projections onto the vectors already orthonormalized, and normalize what remains. The result is an orthonormal basis spanning the same subspace as the input.
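A minimal sketch of the algorithm, assuming linearly independent input columns (gram_schmidt is an illustrative name; this is the modified variant, which behaves better in floating point):

```python
import numpy as np

def gram_schmidt(A):
    """Orthonormalize the columns of A (assumed linearly independent).
    Returns a matrix with orthonormal columns spanning the same subspace."""
    Q = A.astype(float).copy()
    for j in range(Q.shape[1]):
        Q[:, j] /= np.linalg.norm(Q[:, j])            # normalize column j
        for k in range(j + 1, Q.shape[1]):
            Q[:, k] -= (Q[:, j] @ Q[:, k]) * Q[:, j]  # remove the q_j component
    return Q

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))  # orthonormal columns
```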
The singular value decomposition provides another route, yielding orthonormal bases of the row space, column space, kernel, and cokernel of any matrix simultaneously.
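In NumPy, for instance, np.linalg.svd returns these bases directly; a sketch (the rank tolerance 1e-10 is an assumption for the illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])  # rank-1 example

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))  # numerical rank

col_space = U[:, :r]     # orthonormal basis of the column space
cokernel  = U[:, r:]     # orthonormal basis of the cokernel (left null space)
row_space = Vt[:r, :].T  # orthonormal basis of the row space
kernel    = Vt[r:, :].T  # orthonormal basis of the kernel (null space)

print(np.allclose(A @ kernel, 0.0))      # kernel vectors are annihilated by A
print(np.allclose(cokernel.T @ A, 0.0))  # cokernel is orthogonal to the columns
```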