Alpha-Beta Search

Motivation

Minimax search computes the value of a game tree by examining every node, but most of that work is wasted: once a branch is known to be worse for the player to move than some alternative already considered, the exact value of that branch does not matter. Alpha-beta search detects this case and prunes the irrelevant subtree (Knuth and Moore 1975; Pearl 1982). It returns the same minimax value as full minimax, but with dramatically less work (Russell and Norvig 2020).

Problem

A two-player zero-sum game with perfect information, as in minimax search. Given the initial state \(s_0\), compute the minimax value \(V(s_0)\) (and the corresponding optimal move) while examining as few nodes as possible.

Key Ideas

The pruning insight is that a subtree’s value matters only if a parent could choose it. As soon as we can prove a parent would prefer something else, the subtree’s exact value is irrelevant — we may report any value consistent with our bounds and move on.

To make this concrete, the algorithm maintains two bounds along the current path from the root:

  • \(\alpha\) — the best value MAX can guarantee so far. MAX will never accept less than \(\alpha\).
  • \(\beta\) — the best value MIN can guarantee so far. MIN will never accept more than \(\beta\).

The open window \((\alpha, \beta)\) contains exactly the values that could still affect the root decision: only if \(V(s)\) falls strictly between the two bounds can this node change some ancestor's choice. If at any point \(\alpha \ge \beta\), the window is empty and the current node is irrelevant: some ancestor already has an alternative at least as good and will not choose this branch.

Deductive, not heuristic. The pruning argument does not depend on the quality of the evaluation function or any property of the game. Whatever the true value \(V(s)\) turns out to be, if it lies outside \((\alpha, \beta)\) the parent will not pick it. Alpha-beta therefore returns the exact minimax value of the depth-limited tree — it is a faithful implementation of minimax, not an approximation.

Algorithm

function ALPHABETA(s, depth, alpha, beta):
    if Terminal(s) or depth == 0: return EVAL(s)
    if Player(s) == MAX:
        v = -infinity
        for a in Actions(s):
            v = max(v, ALPHABETA(Result(s, a), depth - 1, alpha, beta))
            if v >= beta: return v          # beta cutoff
            alpha = max(alpha, v)
        return v
    else:                                    # Player(s) == MIN
        v = +infinity
        for a in Actions(s):
            v = min(v, ALPHABETA(Result(s, a), depth - 1, alpha, beta))
            if v <= alpha: return v         # alpha cutoff
            beta = min(beta, v)
        return v

The root is called with the full search depth and \(\alpha = -\infty\), \(\beta = +\infty\).

The two early-return lines are the cutoffs:

  • Beta cutoff at a MAX node: MAX has found a move with value \(\ge \beta\). MIN, choosing one level up, would never pick this MAX node over its existing \(\beta\)-bounded alternative, so the remaining moves at this MAX node cannot affect the answer.
  • Alpha cutoff at a MIN node: symmetric. MIN has found a move with value \(\le \alpha\), and MAX will not let the game reach this MIN node when a better-for-MAX option already exists.
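As a concrete rendering of the pseudocode, here is a minimal runnable Python sketch. The nested-list tree encoding and the pass-through evaluator are invented for illustration; a real engine would use game states, move generation, and a heuristic evaluator:

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, evaluate):
    # Terminal position or depth limit reached: fall back to static evaluation.
    if depth == 0 or not isinstance(node, list):
        return evaluate(node)
    if maximizing:                            # MAX to move
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, depth - 1, alpha, beta, False, evaluate))
            if v >= beta:                     # beta cutoff
                return v
            alpha = max(alpha, v)
        return v
    else:                                     # MIN to move
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, depth - 1, alpha, beta, True, evaluate))
            if v <= alpha:                    # alpha cutoff
                return v
            beta = min(beta, v)
        return v

# Depth-2 tree: MAX root, three MIN children, integer leaves evaluate to themselves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = alphabeta(tree, 2, -math.inf, math.inf, True, lambda leaf: leaf)
print(value)   # 3
```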

Walkthrough

Pruning a depth-2 game tree

A standard textbook tree (Russell and Norvig 2020): root is MAX with three MIN children \(B, C, D\), each having three leaves. Leaves are visited left-to-right.

    MAX = 3
      B = 3    leaves: 3, 12, 8
      C ≤ 2    leaves: 2, [4], [6]    ← α-cutoff at C: v = 2 ≤ α = 3
      D = 2    leaves: 14, 5, 2

Bracketed leaves are pruned (not examined). The root is a MAX node; \(B\), \(C\), and \(D\) are MIN nodes.

After examining \(B\), the root has \(\alpha = 3\). Visiting \(C\)’s first leaf reveals \(v = 2\), which already satisfies \(v \le \alpha\) — MIN at \(C\) will return something \(\le 2\), and MAX at the root will not pick \(C\) when it already has \(B\) worth \(3\). So \(C\)’s remaining leaves \(4\) and \(6\) are pruned. At \(D\), the first two leaves do not trigger a cutoff, but the last leaf \(2\) does — only after the entire subtree has been examined, so no nodes are saved. The minimax value at the root is \(3\), and only \(7\) of \(9\) leaves were evaluated.
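The walkthrough's leaf count can be checked mechanically with an instrumented sketch that records each leaf it evaluates (the nested-list tree encoding is invented for illustration):

```python
import math

def alphabeta(node, alpha, beta, maximizing, seen):
    if not isinstance(node, list):       # leaf: record the evaluation
        seen.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, seen))
            if v >= beta:                # beta cutoff
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, seen))
        if v <= alpha:                   # alpha cutoff
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]    # B, C, D
seen = []
value = alphabeta(tree, -math.inf, math.inf, True, seen)
print(value, seen)   # 3 [3, 12, 8, 2, 14, 5, 2] -- C's leaves 4 and 6 pruned
```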

Correctness

Alpha-beta returns the same value as minimax at the root. The pruned subtrees are exactly those whose values cannot influence the root: when the algorithm cuts off at a MAX node with \(v \ge \beta\), the parent MIN already has a child with value \(\le \beta \le v\), so the cut node’s true value (whatever it is, \(\ge v \ge \beta\)) cannot be selected by that MIN. The induction up the tree confirms the root value is unchanged.

Note that alpha-beta does not in general return the same minimax value at internal nodes — only the root and any node whose true value lies inside the \((\alpha, \beta)\) window. Cut-off nodes return only a bound on their value, which is sufficient for the parent.
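The root-level guarantee is easy to spot-check: on randomly generated trees, alpha-beta and plain minimax must agree at the root. A small sketch (the toy tree encoding and generator are invented for illustration):

```python
import math
import random

def minimax(node, maximizing):
    if not isinstance(node, list):       # leaf evaluates to itself
        return node
    values = [minimax(c, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):
        return node
    if maximizing:
        v = -math.inf
        for c in node:
            v = max(v, alphabeta(c, alpha, beta, False))
            if v >= beta:                # beta cutoff
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for c in node:
        v = min(v, alphabeta(c, alpha, beta, True))
        if v <= alpha:                   # alpha cutoff
            return v
        beta = min(beta, v)
    return v

def random_tree(depth, branching=3):
    if depth == 0:
        return random.randint(-10, 10)
    return [random_tree(depth - 1, branching) for _ in range(branching)]

random.seed(42)
for _ in range(200):
    t = random_tree(3)
    assert alphabeta(t, -math.inf, math.inf, True) == minimax(t, True)
print("root values agree on all 200 random trees")
```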

Complexity and Tradeoffs

The benefit depends entirely on move ordering.

Worst case (adversarial ordering). If the worst move is examined first at every node, no pruning occurs and the algorithm degenerates to plain minimax: \(O(b^d)\).

Best case (perfect ordering). If the best move is examined first at every node, alpha-beta examines

\[ O(b^{\lceil d/2 \rceil} + b^{\lfloor d/2 \rfloor}) = O(b^{d/2}) \]

nodes — equivalent to full minimax with branching factor \(\sqrt{b}\). For chess this means roughly doubling the searchable depth in a fixed time budget.

Random ordering. With uniformly random move ordering and i.i.d. uniform leaf values, the expected node count is \(O(b^{3d/4})\), between the two extremes.

The best case is meaningful because the order of examination is under our control: spending effort to order moves well pays for itself many times over. Standard ordering techniques — examining captures before quiet moves, the principal variation from the previous iteration first, killer moves, history heuristics — bring real chess engines close to the \(O(b^{d/2})\) ideal.
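The effect of ordering is visible even on the nine-leaf walkthrough tree. The sketch below counts leaf evaluations under three permutations of the same tree (the permutations and the counting harness are constructed for illustration):

```python
import math

def alphabeta(node, alpha, beta, maximizing, count):
    if not isinstance(node, list):       # leaf: count the evaluation
        count[0] += 1
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, count))
            if v >= beta:                # beta cutoff
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, count))
        if v <= alpha:                   # alpha cutoff
            return v
        beta = min(beta, v)
    return v

# Same depth-2 tree, with children permuted three ways.
orderings = {
    "worst-first": [[6, 4, 2], [14, 5, 2], [12, 8, 3]],    # no pruning
    "left-to-right": [[3, 12, 8], [2, 4, 6], [14, 5, 2]],  # textbook order
    "best-first": [[3, 8, 12], [2, 4, 6], [2, 5, 14]],     # perfect ordering
}
leaves = {}
for name, tree in orderings.items():
    count = [0]
    assert alphabeta(tree, -math.inf, math.inf, True, count) == 3
    leaves[name] = count[0]
print(leaves)   # {'worst-first': 9, 'left-to-right': 7, 'best-first': 5}
```

With perfect ordering the count matches the best-case bound \(b^{\lceil d/2 \rceil} + b^{\lfloor d/2 \rfloor} - 1 = 3 + 3 - 1 = 5\); with adversarial ordering all nine leaves are evaluated.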

When to Use It

  • Two-player zero-sum game with a usable evaluator: yes. The default for chess, checkers, and similar games.
  • Time-controlled play: yes. Combine alpha-beta with iterative deepening and transposition tables; this is the standard tournament setup.
  • Massive branching factor and a weak evaluator (e.g., Go): prefer Monte Carlo tree search.
  • Stochastic moves (dice, chance nodes): use expectimax instead; alpha-beta’s bounds rely on adversarial choice.
  • Imperfect-information games (poker): not applicable; alpha-beta assumes perfect information.

Variants

  • Aspiration windows. Run alpha-beta with a narrow window \((\alpha_0 - \delta, \alpha_0 + \delta)\) around the previous iteration’s value. Most positions return inside the window quickly; rare “fail-high” or “fail-low” results trigger a re-search with a wider window.
  • Principal variation search (PVS) / NegaScout. Search the first move with a full window and the rest with a zero-width window, re-searching only when a move surprises. Achieves further speedups when the first move is usually best.
  • Transposition tables. Cache the value (or bound) of each node by its hash, recognizing that the same position can be reached by different move orders. Combines with alpha-beta cleanly: cached bounds tighten \(\alpha\) and \(\beta\) on subsequent visits.
  • Iterative deepening. Repeated alpha-beta to increasing depths, using one iteration’s principal variation to order moves in the next. Standard practice in tournament play.
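As an illustrative sketch of the first variant, an aspiration-window driver wrapped around a plain alpha-beta (the names and the toy tree are invented for illustration; a real engine would take the guess from the previous iterative-deepening iteration):

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):       # leaf evaluates to itself
        return node
    if maximizing:
        v = -math.inf
        for c in node:
            v = max(v, alphabeta(c, alpha, beta, False))
            if v >= beta:                # beta cutoff
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for c in node:
        v = min(v, alphabeta(c, alpha, beta, True))
        if v <= alpha:                   # alpha cutoff
            return v
        beta = min(beta, v)
    return v

def aspiration(root, guess, delta):
    """Search a narrow window around `guess`, widening on fail-high/fail-low."""
    alpha, beta = guess - delta, guess + delta
    while True:
        v = alphabeta(root, alpha, beta, True)
        if v <= alpha:            # fail-low: true value may lie below the window
            alpha = -math.inf
        elif v >= beta:           # fail-high: true value may lie above the window
            beta = math.inf
        else:
            return v              # inside the window: exact value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(aspiration(tree, 4, 2))    # 3 -- window (2, 6) contains the true value
print(aspiration(tree, 12, 2))   # 3 -- fails low, then re-searches with an open window
```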

References

Knuth, Donald E., and Ronald W. Moore. 1975. “An Analysis of Alpha-Beta Pruning.” Artificial Intelligence 6 (4): 293–326. https://doi.org/10.1016/0004-3702(75)90019-3.
Pearl, Judea. 1982. “The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm and Its Optimality.” Communications of the ACM 25 (8): 559–64. https://doi.org/10.1145/358589.358616.
Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. 4th ed. Pearson. https://www.pearson.com/en-us/subject-catalog/p/artificial-intelligence-a-modern-approach/P200000003500/9780137505135.