Integer Multiplication
Motivation
The grade-school algorithm for multiplying two \(n\)-digit integers takes \(\Theta(n^2)\) digit operations. For two thousand years this was assumed to be optimal — until Karatsuba showed in 1960 that, at the cost of a few extra additions and subtractions, four half-size multiplications can be replaced with three, giving \(\Theta(n^{\log_2 3}) \approx \Theta(n^{1.585})\) (Karatsuba and Ofman 1963; Kleinberg and Tardos 2005).
Karatsuba’s discovery is a foundational result in algorithms: the first proof that a “natural” \(\Theta(n^2)\) algorithm could be improved by exploiting algebraic identities. It set off a sequence of progressively faster algorithms — Toom–Cook, Schönhage–Strassen, and most recently Harvey & van der Hoeven’s \(O(n \log n)\) algorithm — the last of which matches the conjectured lower bound up to constants.
For practical big-integer arithmetic — cryptography, computer algebra, big-data analytics — Karatsuba is the algorithm of choice for moderate sizes (hundreds to a few thousand digits) and a building block for the truly fast algorithms used at very large sizes.
Problem
Multiply two \(n\)-digit non-negative integers \(X\) and \(Y\) in some base \(B\), producing the up-to-\(2n\)-digit product \(X \cdot Y\). The cost model counts single-digit additions and single-digit multiplications, each at unit cost.
We focus on the asymptotic dependence on \(n\); the choice of base affects only constants.
Examples
Schoolbook on two digits
Multiply \(X = 12\) and \(Y = 34\) in base \(10\). The grade-school algorithm computes every digit-pair product and sums with shifts:
| \(X_i\) | \(Y_j\) | \(X_i Y_j\) | shift \(B^{i+j}\) | contribution |
|---|---|---|---|---|
| \(2\) | \(4\) | \(8\) | \(1\) | \(8\) |
| \(2\) | \(3\) | \(6\) | \(10\) | \(60\) |
| \(1\) | \(4\) | \(4\) | \(10\) | \(40\) |
| \(1\) | \(3\) | \(3\) | \(100\) | \(300\) |
Sum: \(300 + 40 + 60 + 8 = 408 = 12 \cdot 34\). Four single-digit multiplications for two-digit operands — in general, \(n^2\) digit products for \(n\)-digit operands, giving \(\Theta(n^2)\).
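To make the \(n^2\) pattern concrete, here is a minimal Python sketch of the digit-by-digit loop — the schoolbook_multiply that the recursive algorithm below bottoms out on. The to_digits helper is ours, added for illustration:

def to_digits(x, base=10):
    # Little-endian digit list of a non-negative integer (at least one digit).
    d = [x % base]
    x //= base
    while x:
        d.append(x % base)
        x //= base
    return d

def schoolbook_multiply(x, y, base=10):
    xd, yd = to_digits(x, base), to_digits(y, base)
    out = [0] * (len(xd) + len(yd))           # product has at most 2n digits
    for i, xi in enumerate(xd):               # n^2 digit pairs in total
        carry = 0
        for j, yj in enumerate(yd):
            s = out[i + j] + xi * yj + carry  # one digit product plus running carry
            out[i + j] = s % base
            carry = s // base
        out[i + len(yd)] += carry
    return sum(d * base**k for k, d in enumerate(out))

On the example above, schoolbook_multiply(12, 34) performs exactly the four digit products in the table and returns 408.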
Naive divide and conquer does not help
Split each operand at the middle:
\[ X = X_1 \cdot B^{n/2} + X_0, \qquad Y = Y_1 \cdot B^{n/2} + Y_0. \]
Expanding the product gives
\[ X \cdot Y = X_1 Y_1 \cdot B^{n} + (X_1 Y_0 + X_0 Y_1) \cdot B^{n/2} + X_0 Y_0, \]
so the work reduces to four multiplications of \(n/2\)-digit numbers plus \(\Theta(n)\) additions and shifts:
\[ T(n) = 4 \, T(n/2) + \Theta(n). \]
By the master theorem this is \(\Theta(n^{\log_2 4}) = \Theta(n^2)\) — the same as schoolbook. Splitting into halves on its own changes nothing because the recursion still has to do four sub-multiplications.
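For contrast, a minimal sketch of this four-multiplication recursion in Python, assuming base \(10\) and a single-digit base case (naive_multiply is an illustrative name, not a library routine):

def naive_multiply(x, y):
    # Base case: a single-digit operand needs only one hardware multiply.
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    x1, x0 = divmod(x, 10**m)   # x = x1 * 10^m + x0
    y1, y0 = divmod(y, 10**m)   # y = y1 * 10^m + y0
    # Four sub-multiplications: T(n) = 4 T(n/2) + Theta(n) = Theta(n^2).
    return (naive_multiply(x1, y1) * 10**(2 * m)
            + (naive_multiply(x1, y0) + naive_multiply(x0, y1)) * 10**m
            + naive_multiply(x0, y0))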
Key Ideas
The four sub-products \(X_1 Y_1\), \(X_1 Y_0\), \(X_0 Y_1\), \(X_0 Y_0\) are not independent. The two cross terms appear together as the middle coefficient \(X_1 Y_0 + X_0 Y_1\), and one extra product captures their sum:
\[ (X_1 + X_0)(Y_1 + Y_0) = X_1 Y_1 + X_1 Y_0 + X_0 Y_1 + X_0 Y_0. \]
So if we compute
\[ P = X_1 Y_1, \qquad Q = X_0 Y_0, \qquad R = (X_1 + X_0)(Y_1 + Y_0), \]
then the cross terms are recovered as \(R - P - Q\), and
\[ X \cdot Y = P \cdot B^{n} + (R - P - Q) \cdot B^{n/2} + Q. \]
Three sub-multiplications instead of four, at the cost of two extra additions of \(n/2\)-digit numbers and two extra subtractions. The arithmetic overhead is \(\Theta(n)\) per recursive call — too small to dominate.
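A throwaway numeric check of the identity on random half-operands (illustrative only, not part of the algorithm):

import random

for _ in range(1000):
    x1, x0, y1, y0 = (random.randrange(10**9) for _ in range(4))
    p, q = x1 * y1, x0 * y0
    r = (x1 + x0) * (y1 + y0)
    assert r - p - q == x1 * y0 + x0 * y1   # exactly the cross terms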
The exponent \(\log_2 3 \approx 1.585\) governs the asymptotics: the saving compounds across \(\log_2 n\) levels of recursion, beating \(n^2\) by a polynomial factor.
Deriving the Solution
With three sub-multiplications and \(\Theta(n)\) overhead per level:
\[ T(n) = 3 \, T(n/2) + \Theta(n), \qquad T(1) = \Theta(1). \]
The master theorem (with \(a = 3\), \(b = 2\), \(c = 1\), so \(\log_b a = \log_2 3 > c\)) gives
\[ T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.585}). \]
A direct argument: at level \(\ell\) of the recursion tree there are \(3^\ell\) subproblems of size \(n / 2^\ell\), so the per-level work grows geometrically as \(3^\ell \cdot \Theta(n / 2^\ell) = \Theta(n \cdot (3/2)^\ell)\). The geometric series summed across \(\log_2 n\) levels is dominated by its last term, the leaves, yielding \(\Theta(3^{\log_2 n}) = \Theta(n^{\log_2 3})\).
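Written out, with per-call overhead at most \(c \, n\):
\[ T(n) \;\le\; \sum_{\ell=0}^{\log_2 n} 3^{\ell} \cdot c \, \frac{n}{2^{\ell}} \;=\; c \, n \sum_{\ell=0}^{\log_2 n} \left(\tfrac{3}{2}\right)^{\ell} \;=\; \Theta\!\left( n \cdot \left(\tfrac{3}{2}\right)^{\log_2 n} \right) \;=\; \Theta\!\left( 3^{\log_2 n} \right) \;=\; \Theta\!\left( n^{\log_2 3} \right), \]
using \(2^{\log_2 n} = n\) and \(3^{\log_2 n} = n^{\log_2 3}\) in the last two steps.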
Algorithm
BASE = 10      # digit base; the choice affects only constants

def digit_length(x):
    # Number of base-BASE digits of x (at least 1).
    n = 1
    while x >= BASE:
        x //= BASE
        n += 1
    return n

def karatsuba(x, y, n0=32):
    n = max(digit_length(x), digit_length(y))
    if n <= n0:                          # base case: schoolbook
        return schoolbook_multiply(x, y)
    m = n // 2
    x1, x0 = divmod(x, BASE**m)          # x = x1 * BASE^m + x0
    y1, y0 = divmod(y, BASE**m)          # y = y1 * BASE^m + y0
    p = karatsuba(x1, y1, n0)            # high product:  X1 * Y1
    q = karatsuba(x0, y0, n0)            # low product:   X0 * Y0
    r = karatsuba(x1 + x0, y1 + y0, n0)  # combined:      (X1 + X0) * (Y1 + Y0)
    return p * BASE**(2 * m) + (r - p - q) * BASE**m + q
The base-case threshold \(n_0\) is tuned per platform — typically tens of digits, where schoolbook’s smaller constants beat Karatsuba’s recursive overhead.
The sums \(X_1 + X_0\) and \(Y_1 + Y_0\) may carry into one extra digit, so the recursive call \(C\) is on operands of size \(n/2 + 1\) rather than exactly \(n/2\). This costs only constant factors and does not affect the asymptotics.
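An end-to-end sanity check against Python's built-in big-integer multiplication (a throwaway test using the karatsuba above):

import random

for _ in range(100):
    x = random.randrange(10**200)
    y = random.randrange(10**200)
    assert karatsuba(x, y) == x * y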
Walkthrough
Multiply \(X = 1234\) by \(Y = 5678\) in base \(10\). Split each four-digit number at the middle: \(X_1 = 12\), \(X_0 = 34\), \(Y_1 = 56\), \(Y_0 = 78\), with \(n = 4\) and \(m = 2\). The three sub-products are \(P = 12 \cdot 56 = 672\), \(Q = 34 \cdot 78 = 2652\), and \(R = (12 + 34)(56 + 78) = 46 \cdot 134 = 6164\). The middle coefficient is \(R - P - Q = 6164 - 672 - 2652 = 2840\), and recombining gives \(672 \cdot 10^4 + 2840 \cdot 10^2 + 2652 = 7{,}006{,}652 = 1234 \cdot 5678\).
If we descended one more level, each two-digit multiplication would itself spawn three single-digit multiplications instead of four — the savings compound.
Correctness
Direct algebra. The identity
\[ P \cdot B^{n} + (R - P - Q) \cdot B^{n/2} + Q = X_1 Y_1 \cdot B^{n} + (X_1 Y_0 + X_0 Y_1) \cdot B^{n/2} + X_0 Y_0 \]
reduces to checking that \(R - P - Q = X_1 Y_0 + X_0 Y_1\), which expands directly from the definition \(R = (X_1 + X_0)(Y_1 + Y_0)\). Induction on \(n\) then carries the result through the recursion: assuming the recursive calls return the correct sub-products, the recombination is exact.
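The expansion, written out in one line:
\[ R - P - Q = (X_1 + X_0)(Y_1 + Y_0) - X_1 Y_1 - X_0 Y_0 = X_1 Y_0 + X_0 Y_1. \]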
Complexity and Tradeoffs
The recurrence \(T(n) = 3 \, T(n/2) + \Theta(n)\) resolves to
\[ T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.585}) \]
digit operations; see the derivation above.
Auxiliary memory is \(\Theta(n)\) for the recursion stack and intermediate operand buffers.
Important tradeoffs:
- Constant factors: Karatsuba does more additions and bookkeeping per level than schoolbook. In practice, schoolbook is faster for \(n \lesssim 30\) digits; the recursive code has to bottom out at a schoolbook base case (a rough crossover-measuring harness follows this list).
- Carry handling: \(X_1 + X_0\) can overflow into one extra digit, which the implementation must accommodate.
- Sign handling: the formulation above assumes non-negative operands; signed multiplication is reduced to unsigned by tracking signs separately.
- Asymptotic optimality: Karatsuba is not asymptotically optimal — \(O(n \log n)\) is achievable (Harvey and van der Hoeven 2021) — but it is the practical sweet spot for moderate sizes and the simplest of the sub-quadratic algorithms.
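To locate the schoolbook/Karatsuba crossover on a given machine, a rough harness such as the following works (illustrative only; absolute numbers vary by platform, and forcing \(n_0 = 8\) makes the recursion do real work at small sizes):

import random
import timeit

for digits in (8, 16, 32, 64, 128, 256):
    x = random.randrange(10**digits)
    y = random.randrange(10**digits)
    t_school = timeit.timeit(lambda: schoolbook_multiply(x, y), number=100)
    t_kara = timeit.timeit(lambda: karatsuba(x, y, n0=8), number=100)
    print(f"{digits:4d} digits  schoolbook {t_school:.4f}s  karatsuba {t_kara:.4f}s")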
When to Use It
Karatsuba dominates schoolbook above a small threshold and remains competitive into the few-thousand-digit range. The broader pattern is the Karatsuba trick: when an algorithm’s recursive structure produces sub-products that share algebraic content, an extra evaluation can replace one of them.
| Operand size | Typical algorithm |
|---|---|
| Single machine word | Hardware multiplier — \(\Theta(1)\) |
| \(n \lesssim 30\) digits | Schoolbook — \(\Theta(n^2)\), smallest constants |
| \(30 \lesssim n \lesssim\) a few thousand | Karatsuba — \(\Theta(n^{1.585})\) |
| Larger | Toom–Cook, then Schönhage–Strassen or other FFT-based |
| Extremely large (\(n \gtrsim 10^6\)) | Harvey & van der Hoeven — \(O(n \log n)\), so far of theoretical interest; FFT-based methods in practice |
In practice, libraries such as GMP layer these algorithms by size, dispatching to whichever is fastest at a given \(n\).
Variants
Toom–Cook-\(k\). Generalize the split-into-two to split-into-\(k\). Evaluating the polynomial product at \(2k - 1\) points and interpolating gives \(T(n) = (2k-1) \, T(n/k) + \Theta(n)\) with cost \(\Theta(n^{\log_k(2k-1)})\).
| \(k\) | Multiplications | Complexity |
|---|---|---|
| \(2\) (Karatsuba) | \(3\) | \(\Theta(n^{1.585})\) |
| \(3\) (Toom-3) | \(5\) | \(\Theta(n^{1.465})\) |
| \(4\) (Toom-4) | \(7\) | \(\Theta(n^{1.404})\) |
The constant factor grows with \(k\), so very large \(k\) is impractical.
Schönhage–Strassen (1971): \(\Theta(n \log n \log \log n)\) via FFT over a finite ring (Schönhage and Strassen 1971).
Fürer (2007): \(O(n \log n \cdot 2^{O(\log^* n)})\) (Fürer 2007).
Harvey & van der Hoeven (2019): \(O(n \log n)\), matching the conjectured lower bound (Harvey and van der Hoeven 2021).