Closest Pair of Points

Motivation

Given \(n\) points in the plane, find the pair with minimum Euclidean distance. The brute-force algorithm checks all \(\binom{n}{2}\) pairs in \(\Theta(n^2)\) time. A clever divide-and-conquer algorithm achieves \(\Theta(n \log n)\) (Kleinberg and Tardos 2005; Cormen et al. 2009), which is asymptotically optimal: closest pair is at least as hard as element distinctness, which has an \(\Omega(n \log n)\) lower bound in the algebraic decision tree model (see Complexity and Tradeoffs).

The algorithm is a textbook illustration of how divide and conquer beats the obvious quadratic baseline when the geometry of the problem permits a clever combine step. The combine step here is the key insight: we never need to check more than a handful of points across the dividing line.

Problem

Input: \(n\) points \(P = \{p_1, \dots, p_n\}\) in \(\mathbb{R}^2\), with \(p_i = (x_i, y_i)\). Output: the minimum Euclidean distance \[ \delta^* = \min_{i \ne j} \|p_i - p_j\|, \] together with a pair achieving it.

We assume the points are in general position (no two share an \(x\)-coordinate); ties can be broken with a secondary \(y\)-comparison without affecting the algorithm.
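A minimal brute-force reference, assuming points are given as \((x, y)\) tuples (the name brute_force is illustrative); it reappears later as the recursion's base case:

from math import dist   # Euclidean distance, Python 3.8+

def brute_force(pts):
    # Check all n-choose-2 pairs: Theta(n^2), trivially correct.
    return min(dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:])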

Examples

Closest pair across the dividing line

A naive divide-and-conquer that just recursed on the two halves and returned \(\min(\delta_L, \delta_R)\) would be wrong on this four-point input:

point | \((x, y)\)
\(A\) | \((0, 0)\)
\(B\) | \((0, 5)\)
\(C\) | \((2, 0)\)
\(D\) | \((2, 5)\)

Splitting at \(x_{\text{mid}} = 1\) gives left half \(\{A, B\}\) and right half \(\{C, D\}\). Both recursive calls return distance \(5\) (the vertical pairs). But the actual closest pairs are \(A\)–\(C\) and \(B\)–\(D\), both with distance \(2\) — they span the dividing line. Without the combine step, the algorithm reports \(5\) when the answer is \(2\).

This is exactly why the algorithm needs a third case beyond “recurse left, recurse right”: cross-line pairs can win.

Brute force on the same input does six comparisons

For four points, brute force checks \(\binom{4}{2} = 6\) pairs and finds the answer immediately. The divide-and-conquer machinery only pays off once \(n\) is large enough that \(n^2 \gg n \log n\) — but the cross-line subtlety from the four-point example does not depend on \(n\) and must be handled at every level of the recursion.
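Running the naive recurse-only scheme and the brute_force sketch from Problem side by side makes the failure concrete:

from math import dist

A, B, C, D = (0, 0), (0, 5), (2, 0), (2, 5)
print(min(dist(A, B), dist(C, D)))   # 5.0: recurse-only, misses cross pairs
print(brute_force([A, B, C, D]))     # 2.0: the true answer (A-C and B-D)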

Key Ideas

After the two recursive calls return, the closest pair within either half has distance \(\ge \delta = \min(\delta_L, \delta_R)\). The closest pair overall is either also \(\ge \delta\), or it spans the dividing line with both points within \(\delta\) of \(x_{\text{mid}}\). The set of candidates is therefore confined to a vertical strip of width \(2\delta\) around the dividing line.

Two geometric facts make the strip cheap to scan:

Sparsity. If two strip points have \(y\)-coordinates differing by more than \(\delta\), their Euclidean distance is also more than \(\delta\). So when the strip is sorted by \(y\), only nearby entries need to be compared.

Constant packing. For any point \(p\) in the strip, the set of strip points with \(y\)-coordinate within \(\delta\) of \(p\) lies inside a \(2\delta \times \delta\) rectangle. Tile this rectangle with eight \(\delta/2 \times \delta/2\) squares. Each square sits entirely in \(L\) or entirely in \(R\) and has diameter \(\delta\sqrt{2}/2 < \delta\), so it contains at most one of our points (otherwise that pair would already be closer than \(\delta\), contradicting \(\delta\) being the within-half minimum). The rectangle therefore holds at most \(7\) candidates besides \(p\).

So each strip point compares itself only to the next \(7\) entries in \(y\)-order — a constant. The whole strip-scan is \(\Theta(|\text{strip}|) = O(n)\).

Deriving the Solution

The cost of one recursive call breaks into three parts: the two recursions on size-\((n/2)\) subproblems, plus the combine work. If the combine work is \(\Theta(n)\), the recurrence is

\[ T(n) = 2 \, T(n/2) + \Theta(n), \]

which by the master theorem (with \(a = 2\), \(b = 2\), and driving term \(\Theta(n^c)\) for \(c = 1\), so \(\log_b a = c\)) gives \(T(n) = \Theta(n \log n)\).
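Unrolling the recursion tree says the same thing concretely: level \(k\) has \(2^k\) subproblems of size \(n/2^k\), each paying linear combine cost, so each of the \(\Theta(\log n)\) levels contributes \(\Theta(n)\):

\[ T(n) = \sum_{k=0}^{\log_2 n} 2^k \cdot \Theta\!\left(\frac{n}{2^k}\right) = \Theta(n \log n). \]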

Linear-time combine demands two things: producing the strip in \(O(n)\), and scanning it in \(O(|\text{strip}|)\). Both require the points to be available in \(y\)-sorted order at every recursive call, not just at the top. Re-sorting at each level would cost \(O(n \log n)\) per level and balloon the total to \(O(n \log^2 n)\).

The fix is to maintain \(y\)-sorted order through the recursion: at each call, partition the inherited \(y\)-sorted list into the \(L\)-members and the \(R\)-members. Membership is decided by \(x\)-coordinate against \(x_{\text{mid}}\), and a single linear pass produces the two \(y\)-sorted child lists. Pre-sort by \(y\) once at the top; then the per-level work stays \(\Theta(n)\) and the recurrence holds.
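As code, the split is a single stable pass; a minimal sketch under the same general-position assumption (distinct \(x\)-coordinates), with points as \((x, y)\) tuples:

def split_by_x(P_y, x_mid):
    # One linear pass; appending preserves relative order, so both
    # children inherit y-sorted order for free (a stable partition).
    L_y, R_y = [], []
    for p in P_y:
        (L_y if p[0] <= x_mid else R_y).append(p)
    return L_y, R_y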

Algorithm

Pre-process by sorting points by \(x\)-coordinate (call this \(P_x\)) and by \(y\)-coordinate (call this \(P_y\)). The sorted lists are passed to the recursion.

In runnable Python (the brute_force sketch from Problem serves as the base case):

from math import dist

def closest_pair(points):
    # Pre-sort once; the x- and y-sorted views are threaded through
    # the recursion so no level ever re-sorts.
    P_x = sorted(points)                        # by x (ties by y)
    P_y = sorted(points, key=lambda p: p[1])    # by y
    return _closest(P_x, P_y)

def _closest(P_x, P_y):
    n = len(P_x)
    if n <= 3:
        return brute_force(P_x)                 # base case by exhaustion

    mid = n // 2
    L_x, R_x = P_x[:mid], P_x[mid:]
    x_mid = L_x[-1][0]                          # x of the last left point

    # Stable linear pass: both children inherit y-sorted order.  O(n).
    # Membership by x against x_mid assumes distinct x-coordinates.
    L_y = [p for p in P_y if p[0] <= x_mid]
    R_y = [p for p in P_y if p[0] > x_mid]

    delta = min(_closest(L_x, L_y), _closest(R_x, R_y))

    # Combine: check pairs spanning the dividing line within delta.
    strip = [p for p in P_y if abs(p[0] - x_mid) < delta]    # already y-sorted
    for i in range(len(strip)):
        for j in range(i + 1, min(i + 8, len(strip))):       # next 7 entries
            delta = min(delta, dist(strip[i], strip[j]))
    return delta

The constant 7 in the inner loop is the packing bound from Key Ideas: any two strip points more than seven positions apart in \(y\)-order are guaranteed to be at distance \(\ge \delta\), so checking further is wasted work.

Walkthrough

Eight points: \(A(1,8), B(3,1), C(2,4), D(6,6)\) on the left and \(E(8,5), F(10,8), G(13,6), H(11,2)\) on the right. The dividing line is at \(x_{\text{mid}} = 7\).

[Figure: the eight points with the dividing line \(x_{\text{mid}} = 7\) and the strip of half-width \(\delta\) on each side. Recursion gives \(\delta = \min(\delta_L, \delta_R) = \sqrt{10} \approx 3.16\) (from \(B\)–\(C\) on the left). Strip points \(D\), \(E\), \(F\) are checked in \(y\)-order; the pair \(D\)–\(E\) with distance \(\sqrt{5} \approx 2.24\) wins.]

The recursive calls return \(\delta_L = \|B - C\| = \sqrt{10}\) from the left and \(\delta_R = \|E - F\| = \|F - G\| = \sqrt{13}\) from the right, so \(\delta = \sqrt{10} \approx 3.16\). The strip \(|x - 7| < \delta\) contains only \(D, E, F\) — five of the eight points are too far from the dividing line to participate. Sorted by \(y\), the strip is \(E, D, F\), and the consecutive comparison \(D \leftrightarrow E\) yields distance \(\sqrt{5} \approx 2.24\), the new closest pair. Without the combine step, the algorithm would have missed this cross-line pair entirely.
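Feeding these eight points to the closest_pair sketch from Algorithm reproduces the walkthrough end to end:

pts = [(1, 8), (3, 1), (2, 4), (6, 6),      # A, B, C, D (left)
       (8, 5), (10, 8), (13, 6), (11, 2)]   # E, F, G, H (right)
print(closest_pair(pts))   # 2.2360... = sqrt(5), the D-E pair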

Correctness

By induction on \(n\). The base case (brute force on \(\le 3\) points) is correct by exhaustion. For the recursive case, the closest pair either lies entirely in \(L\), entirely in \(R\), or spans the dividing line. The first two cases are covered by the recursive calls (inductive hypothesis). The third is covered by the combine step: any cross-line pair closer than \(\delta\) has both endpoints in the strip, and the packing lemma guarantees the two endpoints are at most seven positions apart in \(y\)-order, so the bounded inner loop examines that pair.

Complexity and Tradeoffs

Pre-sorting both lists costs \(O(n \log n)\) once. The recursive cost obeys

\[ T(n) = 2 T(n/2) + \Theta(n), \]

which the master theorem resolves to \(\Theta(n \log n)\). Total time is \(\Theta(n \log n)\).

Auxiliary memory is \(\Theta(n)\) for the sorted lists and the per-level partition buffers.

The bound is asymptotically optimal: closest pair is at least as hard as element distinctness (given \(n\) real numbers, decide whether all are distinct), which has an \(\Omega(n \log n)\) lower bound in the algebraic decision tree model. So no algorithm can do asymptotically better than \(\Theta(n \log n)\) in this model.

Important tradeoffs:

  • Constant factors: the divide-and-conquer code is more involved than brute force, and brute force wins on small \(n\) (\(n \lesssim 30\) in practice). Production implementations switch to brute force at the recursion base case.
  • Pre-sorting requirement: the algorithm needs both \(x\)- and \(y\)-sorted views, threaded through the recursion. A naive re-sort at each level inflates the time to \(\Theta(n \log^2 n)\).
  • Floating-point comparisons: comparing squared distances avoids square roots until the final answer, which is faster and avoids one source of rounding error (sketched after this list).
  • Dimension: in \(d\) dimensions the strip-scan constant grows exponentially in \(d\); the algorithm degrades and other techniques take over.
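On the squared-distance point above, the change is mechanical; a minimal sketch (dist_sq is an illustrative name):

def dist_sq(p, q):
    # Squared Euclidean distance: monotone in the true distance,
    # so every comparison can be made without a square root.
    dx, dy = p[0] - q[0], p[1] - q[1]
    return dx * dx + dy * dy

Use dist_sq for all comparisons, carry \(\delta\) as a squared value (the strip test becomes \((x - x_{\text{mid}})^2 < \delta^2\)), and apply math.sqrt once to the final answer.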

When to Use It

Use planar divide-and-conquer when \(n\) is large enough that quadratic time is unacceptable and the points live in low-dimensional Euclidean space. The broader pattern is geometric divide and conquer with a constant-time combine per element: split by a coordinate, recurse on the two halves, and use the geometry of the problem to bound how many cross-cut pairs need to be examined.

Situation | Typical algorithm
Small input (\(n \lesssim 30\)) | Brute force: \(\Theta(n^2)\) with tiny constants
Planar exact closest pair | Divide and conquer: \(\Theta(n \log n)\)
Many queries on a fixed point set | Build a \(k\)-d tree or Voronoi diagram once, then query repeatedly
Approximate closest pair, low dimension | \(k\)-d tree with pruning
Approximate nearest neighbour, high dimension | Locality-sensitive hashing or random projection; exact methods hit the curse of dimensionality

Variants

  • Higher dimensions. The same template extends to \(\mathbb{R}^d\), but the strip-scan constant grows exponentially in \(d\), giving \(O(n \log^{d-1} n)\). In high dimension, no truly subquadratic exact algorithm is known — this is the curse of dimensionality, related to nearest-neighbour search.
  • Randomized linear time. Rabin's randomized algorithm achieves expected \(O(n)\) in the floor function model by hashing points into a grid of cell-side \(\delta_0\) for a sampled candidate distance \(\delta_0\); a sketch of the grid check appears after this list.
  • Bichromatic closest pair. Given red and blue point sets, find the closest red-blue pair. The same divide-and-conquer template works with an adjusted strip scan.
  • Dynamic closest pair. Maintain the closest pair under insertions and deletions; supported by data structures based on Voronoi diagrams or randomized incremental construction.
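To make the grid idea in the Rabin bullet concrete, here is a minimal sketch of the verification subroutine only, not the full algorithm (which also samples \(\delta_0\)): any pair closer than \(\delta_0\) differs by less than \(\delta_0\) in each coordinate, so it lands in the same or an adjacent grid cell.

from math import dist, floor, inf

def grid_check(points, delta0):
    # Hash each point into a cell of side delta0 and compare it with
    # points in its own and the eight adjacent cells.  The result is
    # < delta0 iff some pair is closer than delta0.  Expected O(n)
    # only when delta0 is a well-sampled candidate distance.
    cells = {}
    best = inf
    for p in points:
        cx, cy = floor(p[0] / delta0), floor(p[1] / delta0)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for q in cells.get((cx + dx, cy + dy), ()):
                    best = min(best, dist(p, q))
        cells.setdefault((cx, cy), []).append(p)   # each pair checked once
    return best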

References

Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms. 3rd ed. MIT Press. https://mitpress.mit.edu/9780262533058/introduction-to-algorithms/.
Kleinberg, Jon, and Éva Tardos. 2005. Algorithm Design. Pearson/Addison-Wesley. https://www.pearson.com/en-us/subject-catalog/p/algorithm-design/P200000003259.