Runtime and Asymptotic Analysis

Motivation

When comparing algorithms we rarely care about the exact number of operations on a particular machine — we care about how the running time grows with input size \(n\). Asymptotic analysis abstracts away constant factors, hardware, and lower-order terms so that claims like “mergesort beats insertion sort on large inputs” have a precise meaning that holds across machines and implementations (Cormen et al. 2009; Kleinberg and Tardos 2005).

Without an asymptotic vocabulary, a claim such as “this algorithm is faster” is meaningless: faster on which inputs, on which hardware, in which language? Big-O notation gives us a single, machine-independent answer.

Big-O, Big-Omega, Big-Theta

Let \(T(n)\) be the running time of an algorithm on inputs of size \(n\), and let \(f(n)\) be a reference function (e.g., \(n\), \(n \log n\), \(n^2\)).

  • Upper bound (\(O\)): \(T(n) = O(f(n))\) if there exist constants \(c > 0\) and \(n_0 \ge 0\) such that \(T(n) \le c \cdot f(n)\) for all \(n \ge n_0\).
  • Lower bound (\(\Omega\)): \(T(n) = \Omega(f(n))\) if there exist constants \(c > 0\) and \(n_0 \ge 0\) such that \(T(n) \ge c \cdot f(n)\) for all \(n \ge n_0\).
  • Tight bound (\(\Theta\)): \(T(n) = \Theta(f(n))\) if both \(T(n) = O(f(n))\) and \(T(n) = \Omega(f(n))\).

The strict variants \(o(f)\) and \(\omega(f)\) require the ratio \(T(n)/f(n)\) to tend to \(0\) or \(\infty\) respectively.
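
For concreteness, here is a short worked example (a standard illustration, not tied to any algorithm in this section). Take \(T(n) = 3n^2 + 5n + 2\). For all \(n \ge 1\),
\[ 3n^2 \;\le\; 3n^2 + 5n + 2 \;\le\; 3n^2 + 5n^2 + 2n^2 \;=\; 10n^2, \]
so \(T(n) = O(n^2)\) with \(c = 10\), \(T(n) = \Omega(n^2)\) with \(c = 3\), and hence \(T(n) = \Theta(n^2)\), all with \(n_0 = 1\).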

Why constants don’t matter

A constant-factor speedup can always be matched by waiting for a faster machine; an asymptotic improvement cannot. If \(T_A(n) = 100 n \log n\) and \(T_B(n) = n^2\), then \(A\) overtakes \(B\) at roughly \(n \approx 1{,}000\), even though \(A\)'s hundred-fold constant factor makes \(B\) faster on small inputs.
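
As a quick numerical check, the short Python sketch below (illustrative only, and assuming base-2 logarithms) searches for that crossover directly:

from math import log2

# Find the first n at which 100 * n * log2(n) drops below n^2,
# i.e. the point where algorithm A overtakes algorithm B.
n = 2
while 100 * n * log2(n) >= n * n:
    n += 1
print(n)  # prints 997, consistent with the crossover near n = 1,000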

Asymptotic analysis is also robust to model details: counting comparisons, RAM operations, or word-RAM steps yields the same big-O class for almost every standard algorithm.

Worst, average, and best case

Running time depends on the input, not just its size:

  • Worst-case \(T(n)\): the maximum cost over all inputs of size \(n\). The default in algorithm analysis because it gives a guarantee.
  • Average-case \(T(n)\): the expected cost under some distribution over inputs of size \(n\). Useful when worst cases are pathological and rarely encountered.
  • Best-case \(T(n)\): the minimum cost over all inputs of size \(n\); almost never quoted on its own because it can be misleading, as the linear-search sketch below illustrates.
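
Linear search makes the distinction concrete. The following Python sketch is added purely for illustration (it is not one of the procedures analysed in the next section): the cost depends on where, and whether, the target occurs, not just on \(n\).

def linear_search(A, target):
    """Return the index of the first occurrence of target in A, or -1 if absent."""
    for i, x in enumerate(A):
        if x == target:   # best case: target sits at A[0], one comparison
            return i
    return -1             # worst case: target absent, all n elements compared

A = [7, 3, 9, 3, 1]
print(linear_search(A, 7))   # best case: one comparison, prints 0
print(linear_search(A, 4))   # worst case: n comparisons, prints -1

If a present target is equally likely to be at any of the \(n\) positions, the expected number of comparisons is \((n+1)/2\), so the average case is \(\Theta(n)\), the same growth class as the worst case.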

Example: comparing two loop procedures

Consider two procedures on an array \(A\) of \(n\) numbers.

Procedure A — compute the sum (single loop):

total = 0
for i = 1 to n:
    total = total + A[i]

Procedure B — compute all pairwise sums (nested loops):

for i = 1 to n:
    for j = i+1 to n:
        print A[i] + A[j]

Counting additions:

  • Procedure A performs \(n\) additions, which is \(\Theta(n)\).
  • Procedure B performs \(\frac{n(n-1)}{2}\) additions, which is \(\Theta(n^2)\).

In Procedure B the outer loop runs \(n\) times; for outer index \(i\) the inner loop runs \(n - i\) times, giving \(\sum_{i=1}^{n-1}(n-i) = \frac{n(n-1)}{2}\) total additions.

To see that this is \(\Theta(n^2)\): the upper bound uses \(\frac{n(n-1)}{2} \le \frac{n^2}{2}\), giving \(c = \tfrac{1}{2}\). The lower bound uses \(\frac{n(n-1)}{2} \ge \frac{n^2}{4}\) for \(n \ge 2\), giving \(c = \tfrac{1}{4}\). Both hold from \(n_0 = 2\).
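
The closed form is easy to sanity-check. The Python sketch below (illustrative only) mirrors Procedure B's loop bounds, counts one addition per inner iteration, and compares the result with \(n(n-1)/2\):

def additions_in_procedure_b(n):
    """Count the additions A[i] + A[j] performed by Procedure B's nested loops."""
    count = 0
    for i in range(1, n + 1):           # outer loop: i = 1 .. n
        for j in range(i + 1, n + 1):   # inner loop: j = i + 1 .. n
            count += 1                  # one addition per (i, j) pair
    return count

for n in (1, 2, 5, 10, 100):
    assert additions_in_procedure_b(n) == n * (n - 1) // 2
print("direct count matches n(n-1)/2 for the sampled sizes")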

Even if Procedure A carries a thousand-fold constant factor on today's hardware, say \(T_A(n) = 1000n\), Procedure B performs more additions for all \(n > 2001\), since \(\frac{n(n-1)}{2} > 1000n\) exactly when \(n - 1 > 2000\). The quadratic term eventually dominates regardless of the constant.

Common growth rates

Ordered from slowest- to fastest-growing: \[ 1 \prec \log n \prec \sqrt{n} \prec n \prec n \log n \prec n^2 \prec n^3 \prec 2^n \prec n! \]

A standard mental model: anything polynomial (\(n^c\) for fixed \(c\)) is considered tractable; anything exponential (\(c^n\) for \(c > 1\)) is generally intractable for large \(n\).
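
To make the separation tangible, the following Python sketch (the input size \(n = 30\) is an arbitrary choice for illustration) evaluates each growth function at a single modest input:

from math import log2, sqrt, factorial

# Evaluate each growth rate at n = 30 to show how quickly the classes separate.
n = 30
growth = {
    "log n": log2(n),
    "sqrt n": sqrt(n),
    "n": n,
    "n log n": n * log2(n),
    "n^2": n ** 2,
    "n^3": n ** 3,
    "2^n": 2 ** n,
    "n!": factorial(n),
}
for name, value in growth.items():
    print(f"{name:>8}: {value:.3g}")

Even at \(n = 30\), the exponential and factorial entries are already many orders of magnitude larger than the polynomial ones.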

Caveats

Asymptotic bounds can hide enormous constants. The Coppersmith–Winograd matrix-multiplication algorithm is asymptotically faster than Strassen but is never used in practice because the crossover point is astronomically large. In real systems, cache behavior, branch prediction, and memory access patterns can dominate.

Big-O is therefore a necessary but not sufficient tool for choosing an algorithm: it tells you which algorithms cannot work for large \(n\), but not which will be fastest at the size you care about.

References

Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms. 3rd ed. MIT Press. https://mitpress.mit.edu/9780262533058/introduction-to-algorithms/.
Kleinberg, Jon, and Éva Tardos. 2005. Algorithm Design. Pearson/Addison-Wesley. https://www.pearson.com/en-us/subject-catalog/p/algorithm-design/P200000003259.