Simplex Method
Motivation
The simplex method (Dantzig 1963) is the most widely used algorithm for solving linear programs. It works by exploiting a key geometric fact: the optimal value of any LP, if finite, is achieved at a vertex of the feasible polytope. Simplex starts at one vertex and walks along edges to neighboring vertices, always improving the objective, until no improving neighbor exists.
Understanding simplex matters beyond just “how LP solvers work.” Warm starts, pivot rules, degeneracy, and formulation choices all affect practical solver performance. Branch-and-bound methods for integer programming rely on repeated LP solves that warm-start simplex from a previous basis, making simplex — not interior-point methods — the algorithm of choice inside most ILP solvers.
Problem
Given the standard-form linear program
\[ \max c^\top x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0, \]
find \(x\) achieving the maximum (if finite), or report unboundedness or infeasibility.
Key Ideas
Slack form: equalities instead of inequalities
Introduce one slack variable \(s_i \ge 0\) per constraint, converting each inequality to an equality:
\[ Ax + s = b, \quad x \ge 0, \quad s \ge 0. \]
The combined variable vector \((x, s) \in \mathbb{R}^{n+m}\) has \(n + m\) entries, and there are \(m\) equality constraints. The change is mechanical — slack form has the same feasible region as the original — but makes the linear-algebra steps cleaner.
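The conversion is a one-line matrix concatenation. A minimal numpy sketch, using the walkthrough LP from later in this article (all variable names are illustrative, not from any library):

```python
import numpy as np

# Walkthrough LP: maximize 3*x1 + 2*x2
# s.t. x1 + x2 <= 4,  x1 <= 3,  x2 <= 3,  x >= 0.
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
b = np.array([4.0, 3.0, 3.0])

m, n = A.shape
# Slack form: append an m x m identity, one slack column per constraint,
# so that [A | I] @ (x, s) = b with (x, s) >= 0.
A_slack = np.hstack([A, np.eye(m)])

print(A_slack.shape)  # (m, n + m): 3 equalities in 5 variables
# When b >= 0, setting x = 0 and s = b is an obvious feasible point:
assert np.allclose(A_slack @ np.concatenate([np.zeros(n), b]), b)
```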
A basic feasible solution is a vertex
A basis \(B \subseteq \{1, \ldots, n+m\}\) is a set of \(m\) indices such that the corresponding columns of \([A \mid I]\) are linearly independent. The basic solution for \(B\) sets all non-basic variables (indices not in \(B\)) to zero and solves the resulting \(m \times m\) system for the \(m\) basic variables. If the resulting values are all non-negative, the basic solution is a basic feasible solution (BFS).
Every vertex of the feasible polytope corresponds to at least one BFS. Since the number of BFS is at most \(\binom{n+m}{m}\) (finite), any algorithm that moves between BFS without repeating will terminate. This is what turns the geometric “walk along edges” into a finite combinatorial procedure: each pivot exchanges one basic variable for one non-basic, and the algorithm needs to visit only finitely many bases.
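A short sketch of the basis-to-solution correspondence, using the walkthrough LP's slack form (numpy only; `basic_solution` is an illustrative helper, not a library function):

```python
import numpy as np

# Slack form of the walkthrough LP: [A | I] with b = (4, 3, 3).
A_slack = np.hstack([np.array([[1., 1.], [1., 0.], [0., 1.]]), np.eye(3)])
b = np.array([4., 3., 3.])

def basic_solution(basis):
    """Zero the non-basic variables, solve the m x m system for the basics."""
    B = A_slack[:, basis]             # basis matrix: chosen columns of [A | I]
    x = np.zeros(A_slack.shape[1])
    x[list(basis)] = np.linalg.solve(B, b)
    return x

# Basis {x1, x2, s2}: all values non-negative, so this is a BFS (vertex (1, 3)).
print(basic_solution([0, 1, 3]))
# Basis {x1, x2, s1}: s1 comes out negative, so a basic but infeasible solution.
print(basic_solution([0, 1, 2]))
```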
Reduced costs as the optimality certificate
For a basis \(B\) with basis matrix \(\mathbf{B}\) (the \(m\) columns of \([A \mid I]\) indexed by \(B\)), the reduced cost of non-basic variable \(j\) is
\[ \bar c_j = c_j - c_B^\top \mathbf{B}^{-1} a_j, \]
where \(a_j\) is column \(j\) of \([A \mid I]\) and \(c_B\) collects the objective coefficients of the basic variables. The reduced cost is the rate of change of the objective per unit increase in \(x_j\).
If \(\bar c_j \le 0\) for every non-basic \(j\), the current BFS is optimal — no improving direction exists at this vertex. Otherwise, some non-basic variable with positive reduced cost can enter the basis, and the corresponding edge of the polytope improves the objective.
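The formula can be checked numerically. This sketch computes all reduced costs at once via the simplex multipliers \(y^\top = c_B^\top \mathbf{B}^{-1}\), using the walkthrough LP and the basis it reaches after pivot 1 (names are illustrative):

```python
import numpy as np

A_slack = np.hstack([np.array([[1., 1.], [1., 0.], [0., 1.]]), np.eye(3)])
c = np.array([3., 2., 0., 0., 0.])   # objective coefficients incl. slacks
b = np.array([4., 3., 3.])

def reduced_costs(basis):
    """c̄_j = c_j - c_B^T B^{-1} a_j for every column j (basic ones come out 0)."""
    B = A_slack[:, basis]
    y = np.linalg.solve(B.T, c[basis])   # simplex multipliers c_B^T B^{-1}
    return c - y @ A_slack

# Basis after pivot 1 in the walkthrough: {s1, x1, s3} -> columns [2, 0, 4].
rc = reduced_costs([2, 0, 4])
print(rc)   # x2 has c̄ = 2 (still improving), s2 has c̄ = -3, matching the tableau
assert np.isclose(rc[1], 2.0) and np.isclose(rc[3], -3.0)
```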
Algorithm
```
find an initial basic feasible solution
while some non-basic variable j has c̄_j > 0:
    pick an entering variable j with c̄_j > 0
    compute the updated column ā_j = B⁻¹ a_j
    if ā_j ≤ 0 entry-wise: report the LP unbounded and stop
    minimum-ratio test: leaving variable i = argmin { x_i / ā_{ij} : ā_{ij} > 0 }
    pivot: swap j into the basis, i out
return the current BFS as optimal
```
Minimum-ratio test. Increasing non-basic variable \(j\) from \(0\) while holding all other non-basics at \(0\) decreases each basic variable \(x_i\) at rate \(\bar a_{ij}\). The first basic variable to hit zero is the \(\arg\min_i \{x_i / \bar a_{ij} : \bar a_{ij} > 0\}\); this variable leaves the basis.
Unboundedness. If \(\bar a_j \le 0\) for the entering variable \(j\), increasing \(j\) never drives any basic variable negative: the feasible region is unbounded in the direction of \(j\) and the LP has no finite optimum.
Initialization (Phase I). For LPs with \(b \ge 0\), the slack variables form an initial BFS: set \(x = 0\), \(s = b\). For general \(b\), Phase I finds a starting BFS by solving an auxiliary LP that minimizes the sum of artificial variables added to make the system feasible. If the Phase I optimum is zero, the artificials are all zero and we have a valid starting basis; otherwise the original LP is infeasible.
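Putting the pieces together, here is a minimal tableau implementation, assuming \(b \ge 0\) so the slack basis is a valid start and Phase I can be skipped. It uses Dantzig's entering rule; `simplex` is an illustrative sketch, not a production solver:

```python
import numpy as np

def simplex(c, A, b):
    """Tableau simplex for  max c^T x  s.t.  A x <= b, x >= 0,  assuming b >= 0.
    Returns (x, value); raises ValueError on an unbounded LP."""
    m, n = A.shape
    # Tableau rows: [A | I | b]; last row holds reduced costs and -(objective).
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)
    T[:m, -1] = b
    T[m, :n] = c
    basis = list(range(n, n + m))          # slacks form the starting basis
    while True:
        j = int(np.argmax(T[m, :-1]))      # entering: most positive reduced cost
        if T[m, j] <= 1e-9:
            break                          # all c̄_j <= 0: current BFS optimal
        col = T[:m, j]
        if (col <= 1e-9).all():
            raise ValueError("LP is unbounded")
        # Minimum-ratio test over rows with a positive pivot-column entry.
        ratios = np.where(col > 1e-9,
                          T[:m, -1] / np.where(col > 1e-9, col, 1.0), np.inf)
        r = int(np.argmin(ratios))         # leaving row
        T[r] /= T[r, j]                    # pivot: scale the pivot row,
        for i in range(m + 1):             # then eliminate column j elsewhere
            if i != r:
                T[i] -= T[i, j] * T[r]
        basis[r] = j
    x = np.zeros(n + m)
    x[basis] = T[:m, -1]
    return x[:n], -T[m, -1]

# Walkthrough LP: should reach x = (3, 1) with objective 11 in two pivots.
x, val = simplex(np.array([3., 2.]),
                 np.array([[1., 1.], [1., 0.], [0., 1.]]),
                 np.array([4., 3., 3.]))
print(x, val)
```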
Walkthrough
Use the LP from the linear programming geometry example (converted to maximization slack form):
\[ \text{maximize } 3x_1 + 2x_2, \quad x_1 + x_2 + s_1 = 4, \quad x_1 + s_2 = 3, \quad x_2 + s_3 = 3, \quad \text{all variables} \ge 0. \]
The diagram below shows the three vertices visited and the two pivot steps as a walk on the polytope.
Initial tableau
Initial basis \(B = \{s_1, s_2, s_3\}\), so \(x_1 = x_2 = 0\).
| basis | \(x_1\) | \(x_2\) | \(s_1\) | \(s_2\) | \(s_3\) | RHS |
|---|---|---|---|---|---|---|
| \(s_1\) | 1 | 1 | 1 | 0 | 0 | 4 |
| \(s_2\) | 1 | 0 | 0 | 1 | 0 | 3 |
| \(s_3\) | 0 | 1 | 0 | 0 | 1 | 3 |
| obj | 3 | 2 | 0 | 0 | 0 | 0 |
The objective row shows reduced costs \(\bar c_j\). Both \(x_1\) and \(x_2\) have positive reduced cost; the current BFS has objective \(0\).
Pivot 1: enter \(x_1\) (reduced cost 3), leave \(s_2\) (minimum ratio \(3/1 = 3\))
Ratios: \(s_1\) row gives \(4/1 = 4\); \(s_2\) row gives \(3/1 = 3\); \(s_3\) row has coefficient \(0\), so \(\infty\). Minimum ratio is \(3\), so \(s_2\) leaves.
Divide the \(s_2\) row by \(1\) (pivot element). Eliminate \(x_1\) from all other rows.
| basis | \(x_1\) | \(x_2\) | \(s_1\) | \(s_2\) | \(s_3\) | RHS |
|---|---|---|---|---|---|---|
| \(s_1\) | 0 | 1 | 1 | \(-1\) | 0 | 1 |
| \(x_1\) | 1 | 0 | 0 | 1 | 0 | 3 |
| \(s_3\) | 0 | 1 | 0 | 0 | 1 | 3 |
| obj | 0 | 2 | 0 | \(-3\) | 0 | 9 |
BFS: \(x_1 = 3\), \(x_2 = 0\), \(s_1 = 1\), \(s_3 = 3\). Objective \(= 9\).
Pivot 2: enter \(x_2\) (reduced cost 2), leave \(s_1\) (minimum ratio \(1/1 = 1\))
Ratios: \(s_1\) row gives \(1/1 = 1\); \(x_1\) row has coefficient \(0\), so \(\infty\); \(s_3\) row gives \(3/1 = 3\). Minimum is \(1\).
| basis | \(x_1\) | \(x_2\) | \(s_1\) | \(s_2\) | \(s_3\) | RHS |
|---|---|---|---|---|---|---|
| \(x_2\) | 0 | 1 | 1 | \(-1\) | 0 | 1 |
| \(x_1\) | 1 | 0 | 0 | 1 | 0 | 3 |
| \(s_3\) | 0 | 0 | \(-1\) | 1 | 1 | 2 |
| obj | 0 | 0 | \(-2\) | \(-1\) | 0 | 11 |
All reduced costs \(\le 0\). Optimal: \(x_1 = 3\), \(x_2 = 1\), objective \(= 11\). This matches vertex \(C\) from the geometry diagram.
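Because the number of bases is finite, the claimed optimum can be double-checked by brute force: enumerate every basis, keep the feasible ones, and take the best objective. A sketch for this small example only; real instances have far too many bases for this:

```python
import numpy as np
from itertools import combinations

A_slack = np.hstack([np.array([[1., 1.], [1., 0.], [0., 1.]]), np.eye(3)])
b = np.array([4., 3., 3.])
c = np.array([3., 2., 0., 0., 0.])

best = -np.inf
for basis in combinations(range(5), 3):   # all candidate bases
    B = A_slack[:, basis]
    if abs(np.linalg.det(B)) < 1e-9:
        continue                           # columns dependent: not a basis
    xB = np.linalg.solve(B, b)
    if (xB >= -1e-9).all():                # feasible -> a BFS (a vertex)
        x = np.zeros(5)
        x[list(basis)] = xB
        best = max(best, c @ x)

print(best)   # best objective over all vertices
assert np.isclose(best, 11.0)
```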
Correctness
Soundness. When the algorithm halts with \(\bar c_j \le 0\) for all non-basic \(j\), no edge improves the objective, so the current vertex is locally optimal. Convexity of the LP elevates “locally optimal at a vertex” to “globally optimal.”
Termination via Bland’s rule. A BFS is degenerate if some basic variable has value zero — that is, more than \(n\) constraints are tight at the same vertex. A pivot at a degenerate BFS may leave the objective unchanged (the entering variable enters at level \(0\)). In the worst case, a sequence of degenerate pivots can cycle back to a previously visited basis, looping forever.
Bland’s rule (Bland 1977): enter the non-basic variable with the smallest index among those with positive reduced cost; break minimum-ratio ties the same way. This eliminates cycling and guarantees finite termination. Most practical solvers handle degeneracy through perturbation (slightly randomizing the right-hand side \(b\)) or the lexicographic rule, both of which also guarantee finite termination.
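The entering-variable choice under Bland's rule is a one-liner. A minimal sketch (`blands_entering` is an illustrative name, not a library function):

```python
import numpy as np

def blands_entering(reduced, tol=1e-9):
    """Bland's rule: smallest index among variables with positive reduced cost.
    Returns None when no such index exists (the current BFS is optimal)."""
    candidates = np.flatnonzero(reduced > tol)
    return int(candidates[0]) if candidates.size else None

# Dantzig's rule would enter index 2 (largest reduced cost); Bland enters 1.
assert blands_entering(np.array([-1.0, 0.5, 3.0, 0.0])) == 1
# All reduced costs non-positive: optimal, nothing enters.
assert blands_entering(np.array([-1.0, 0.0, -2.0])) is None
```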
Complexity and Tradeoffs
The worst-case number of pivots is exponential. The Klee–Minty cube (Klee and Minty 1972) is a family of LPs in \(n\) variables where the most-positive-reduced-cost rule visits all \(2^n\) vertices.
In practice, simplex is fast, typically taking a number of pivots polynomial in \(n\) and \(m\) on real-world instances. Smoothed analysis (Spielman and Teng 2004) explains this: if the input is perturbed by tiny random noise, simplex runs in polynomial time with high probability, matching the observed behavior on practical data.
Pivot rules
The algorithm is correct for any choice of entering variable with positive reduced cost, but the pivot rule affects termination and speed.
- Dantzig’s rule. Enter the variable with the most positive reduced cost. Intuitive and fast in practice.
- Bland’s rule. Smallest-index entering variable. Slower in practice but guarantees termination — useful as a fallback when cycling is detected.
- Steepest edge. Enter the variable whose pivot direction has the largest normalized objective improvement. Expensive per step but often pays off in pivot count; the default in most production solvers.
When to Use It
| Situation | Approach |
|---|---|
| LP with warm start available (branch-and-bound, parametric resolves) | Simplex — exploits the previous basis. |
| Massive sparse LP, cold start | Interior-point methods are often faster; a simplex crossover step can then polish the solution to a vertex. |
| Need an exact vertex solution (e.g., for ILP) | Simplex — interior-point returns interior-of-face solutions. |
| Highly degenerate LP | Use a steepest-edge or Bland’s-rule simplex; consider an interior-point method. |
| Very large LP, dense | Interior-point methods scale better in \(n\). |
Variants
- Dual simplex. Maintains dual feasibility instead of primal feasibility, pivoting toward primal feasibility. The standard choice for warm-starting after a constraint is added (e.g., in branch-and-bound).
- Revised simplex. Stores \(\mathbf{B}^{-1}\) implicitly rather than the full tableau, reducing memory and per-pivot work on sparse problems.
- Parametric simplex. Solves a family of LPs parameterized by one parameter (e.g., one varying coefficient) in a single pass.