Dot Product

Motivation

Adding vectors and scaling them are purely algebraic operations. But to measure angles between vectors and project one vector onto another, we need a way to multiply two vectors and get a scalar. The dot product (also called the inner product on \(\mathbb{R}^n\)) provides this (Strang 2016). It is the operation behind orthogonality, Gram-Schmidt, PCA projections, attention scores in transformers, and the prediction formula \(\hat y = \mathbf{w}^\top \mathbf{x}\) in linear models.

Definition

The dot product of two vectors \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) is the scalar

\[ \mathbf{u} \cdot \mathbf{v} = \mathbf{u}^\top \mathbf{v} = \sum_{i=1}^n u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n. \]

Both notations \(\mathbf{u} \cdot \mathbf{v}\) and \(\mathbf{u}^\top \mathbf{v}\) are standard. The second emphasizes that this is a row vector times a column vector.

Example

\[ \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 4 + 10 + 18 = 32. \]
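The same computation in NumPy, as a quick check (a minimal sketch; NumPy offers several equivalent spellings):

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Three equivalent ways to compute the dot product.
print(np.dot(u, v))   # 32
print(u @ v)          # 32, the matrix-multiplication operator
print(np.sum(u * v))  # 32, elementwise product, then sum
```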

Geometric Interpretation

The dot product has a geometric identity:

\[ \mathbf{u}^\top \mathbf{v} = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta, \]

where \(\theta\) is the angle between the two vectors and \(\|\cdot\|\) denotes the Euclidean norm.

[Figure: right-triangle diagram of \(\mathbf{u}\) and \(\mathbf{v}\) with angle \(\theta\); the leg of length \(\|\mathbf{u}\|\cos\theta\) is the signed length of the projection of \(\mathbf{u}\) onto \(\mathbf{v}\), so \(\mathbf{u} \cdot \mathbf{v} = \|\mathbf{v}\| \times (\|\mathbf{u}\|\cos\theta)\).]

Three cases matter most:

| Angle \(\theta\) | Dot product | Meaning |
| --- | --- | --- |
| \(0°\) | \(\lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert\) (maximum positive) | same direction |
| \(90°\) | \(0\) | orthogonal (perpendicular) |
| \(180°\) | \(-\lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert\) (maximum negative) | opposite directions |
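Solving the identity for \(\theta\) makes this easy to verify numerically. A minimal sketch (the helper `angle_between` is illustrative, not a library function); it reproduces the three cases in the table:

```python
import numpy as np

def angle_between(u, v):
    """Angle theta (in degrees) recovered from u.v = ||u|| ||v|| cos(theta)."""
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to [-1, 1] to guard against floating-point round-off.
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

print(angle_between(np.array([1.0, 0.0]), np.array([2.0, 0.0])))   # 0.0   (same direction)
print(angle_between(np.array([1.0, 0.0]), np.array([0.0, 3.0])))   # 90.0  (orthogonal)
print(angle_between(np.array([1.0, 0.0]), np.array([-5.0, 0.0])))  # 180.0 (opposite)
```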

Orthogonality

Two vectors are orthogonal when their dot product is zero: \(\mathbf{u}^\top \mathbf{v} = 0\). Orthogonality generalizes “perpendicular” to any dimension. It is the key condition in orthogonal bases, Gram-Schmidt, and PCA.

Example

\[ \begin{pmatrix} 1 \\ 0 \end{pmatrix}^\top \begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0, \qquad \begin{pmatrix} 3 \\ -4 \end{pmatrix}^\top \begin{pmatrix} 4 \\ 3 \end{pmatrix} = 12 - 12 = 0. \]

Both pairs are orthogonal.
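A quick numerical check of both pairs:

```python
import numpy as np

print(np.dot(np.array([1, 0]), np.array([0, 1])))   # 0 -> orthogonal
print(np.dot(np.array([3, -4]), np.array([4, 3])))  # 0 -> orthogonal
```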

Properties

For any \(\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathbb{R}^n\) and \(c \in \mathbb{R}\):

  • Commutativity: \(\mathbf{u}^\top \mathbf{v} = \mathbf{v}^\top \mathbf{u}\)
  • Linearity in first argument: \((c\mathbf{u})^\top \mathbf{v} = c(\mathbf{u}^\top \mathbf{v})\) and \((\mathbf{u} + \mathbf{w})^\top \mathbf{v} = \mathbf{u}^\top \mathbf{v} + \mathbf{w}^\top \mathbf{v}\)
  • Positive definiteness: \(\mathbf{v}^\top \mathbf{v} \geq 0\), with equality iff \(\mathbf{v} = \mathbf{0}\)

Note that \(\mathbf{v}^\top \mathbf{v} = \|\mathbf{v}\|^2\): the dot product of a vector with itself is its squared length.
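All of these properties are straightforward to spot-check with random vectors. A minimal sketch (the tolerance in `np.isclose` absorbs floating-point round-off):

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 4))  # three random vectors in R^4
c = 2.5

# Commutativity and linearity in the first argument.
assert np.isclose(u @ v, v @ u)
assert np.isclose((c * u) @ v, c * (u @ v))
assert np.isclose((u + w) @ v, u @ v + w @ v)

# Positive definiteness in action: v.v equals the squared Euclidean norm.
assert v @ v >= 0
assert np.isclose(v @ v, np.linalg.norm(v) ** 2)
```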

Projection

A central use of the dot product is projecting one vector onto another. The orthogonal projection of \(\mathbf{u}\) onto the direction of a unit vector \(\hat{\mathbf{q}}\) is

\[ \operatorname{proj}_{\hat{\mathbf{q}}}(\mathbf{u}) = (\hat{\mathbf{q}}^\top \mathbf{u})\, \hat{\mathbf{q}}. \]

The scalar coefficient \(\hat{\mathbf{q}}^\top \mathbf{u}\) measures how far \(\mathbf{u}\) extends in the \(\hat{\mathbf{q}}\) direction. For a general nonzero vector \(\mathbf{q}\), first normalize: set \(\hat{\mathbf{q}} = \mathbf{q}/\|\mathbf{q}\|\).

This formula is what makes Gram-Schmidt work: at each step, you subtract the projection onto each previously constructed orthonormal direction.
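A minimal sketch of the projection formula and of the Gram-Schmidt subtraction step (`project_onto` is an illustrative helper, not a library routine):

```python
import numpy as np

def project_onto(u, q):
    """Orthogonal projection of u onto the direction of q (q need not be unit length)."""
    q_hat = q / np.linalg.norm(q)
    return (q_hat @ u) * q_hat

u = np.array([3.0, 4.0])
q = np.array([1.0, 0.0])

p = project_onto(u, q)
r = u - p  # Gram-Schmidt residual: the part of u orthogonal to q

print(p)      # [3. 0.]
print(r)      # [0. 4.]
print(q @ r)  # 0.0, the residual is orthogonal to q
```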

Cosine Similarity

Rearranging the geometric formula:

\[ \cos\theta = \frac{\mathbf{u}^\top \mathbf{v}}{\|\mathbf{u}\| \, \|\mathbf{v}\|}. \]

This is called the cosine similarity between \(\mathbf{u}\) and \(\mathbf{v}\). It measures directional agreement independent of vector length. Values range from \(-1\) (opposite) to \(+1\) (same direction), with \(0\) meaning orthogonal.

Cosine similarity is the standard way to compare word embeddings and to rank documents in retrieval systems.
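A direct implementation of the formula (a sketch; libraries such as scikit-learn ship their own vectorized versions). Note that rescaling a vector leaves the similarity unchanged:

```python
import numpy as np

def cosine_similarity(u, v):
    """cos(theta) = u.v / (||u|| ||v||); undefined for zero vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

u = np.array([1.0, 2.0, 3.0])

print(cosine_similarity(u, 10 * u))  # 1.0, scaling does not change direction
print(cosine_similarity(u, -u))      # -1.0, opposite direction
print(cosine_similarity(np.array([1.0, 0.0, 0.0]),
                        np.array([0.0, 1.0, 0.0])))  # 0.0, orthogonal
```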

The Cauchy-Schwarz Inequality

The geometric formula implies that \(|\cos\theta| \leq 1\), which gives the Cauchy-Schwarz inequality:

\[ |\mathbf{u}^\top \mathbf{v}| \leq \|\mathbf{u}\| \, \|\mathbf{v}\|. \]

Equality holds exactly when one vector is a scalar multiple of the other, i.e., when \(\mathbf{u}\) and \(\mathbf{v}\) point in the same or opposite directions (or one of them is the zero vector).
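A quick numerical check over random pairs, plus the equality case (a sketch; the small tolerance absorbs floating-point round-off):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    u, v = rng.standard_normal((2, 8))  # a random pair in R^8
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Equality when one vector is a scalar multiple of the other.
u = rng.standard_normal(8)
v = -3 * u
print(np.isclose(abs(u @ v), np.linalg.norm(u) * np.linalg.norm(v)))  # True
```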

References

Strang, Gilbert. 2016. Introduction to Linear Algebra. 5th ed. Wellesley-Cambridge Press.