# Bellman-Ford

## Motivation
The shortest-path problem has two complications that an algorithm may or may not handle: negative edge weights, which break the greedy finalization that makes Dijkstra fast, and negative cycles, which make distances undefined for reachable vertices.
The Bellman-Ford algorithm handles both (Bellman 1958; Ford 1956). It accepts arbitrary real edge weights and detects reachable negative cycles rather than silently returning a wrong answer. The cost is time: \(\Theta(nm)\) versus Dijkstra’s \(O(m \log n)\) on non-negative graphs.
## Problem
Given a directed graph \(G = (V, E)\) with \(n = |V|\) vertices, \(m = |E|\) edges, edge weights \(w : E \to \mathbb{R}\), and a source vertex \(s\), compute the shortest-path distance
\[ d(v) = \min_{P:s \leadsto v} w(P) \]
for every vertex \(v\) reachable from \(s\).
If a negative-weight cycle is reachable from \(s\), distances to vertices reachable beyond it are not finite — the cycle can be traversed repeatedly to make the path weight arbitrarily small. Bellman-Ford detects and reports that case rather than returning a wrong answer.
## Examples

### A negative edge can be harmless
A graph can contain negative edges and still have perfectly well-defined shortest paths:
| edge | weight |
|---|---|
| \(s \to a\) | \(1\) |
| \(s \to b\) | \(4\) |
| \(a \to b\) | \(-2\) |
| \(b \to t\) | \(1\) |
The best route to \(t\) is \(s \to a \to b \to t\) with total weight \(1 - 2 + 1 = 0\). The negative edge helps, but there is no cycle to exploit forever.
### A negative cycle makes the question ill-posed
Now add \(b \to a\) with weight \(1\). The cycle \(a \to b \to a\) has weight \(-2 + 1 = -1\). Once a path reaches \(a\), it can loop around this cycle \(r\) times and reduce its cost by \(r\). There is no minimum finite distance to \(a\), \(b\), or anything reachable from them.
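The blow-up is easy to see numerically. A minimal Python sketch (the function name and loop count are mine, purely for illustration) computes the cost of the walk that goes \(s \to a\) and then loops \(a \to b \to a\) a total of \(r\) times:

```python
# Walk cost: edge s -> a (weight 1), then r trips around the
# cycle a -> b -> a, whose total weight is -2 + 1 = -1.
def walk_cost(r: int) -> int:
    return 1 + r * (-2 + 1)  # each extra loop subtracts 1

print([walk_cost(r) for r in range(5)])  # [1, 0, -1, -2, -3]
```

Each additional loop lowers the cost by one, so no finite minimum exists.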
## Key Ideas
A path with repeated vertices contains a cycle. If the cycle has non-negative weight, removing it cannot increase the total path length. If it has negative weight, the distance to anything reachable beyond it is undefined — the cycle can be traversed arbitrarily many times to drive the cost to \(-\infty\). So when shortest paths are well-defined, every shortest path can be chosen to be simple: no repeated vertices.
A simple path in an \(n\)-vertex graph uses at most \(n - 1\) edges. So \(n - 1\) relaxation passes are enough to compute all shortest-path distances when they are finite. A single additional pass then reveals whether any negative cycle is reachable: any edge that can still be relaxed after \(n - 1\) rounds implies a walk of at least \(n\) edges that keeps improving, which is only possible if a negative cycle is reachable from \(s\).
## Deriving the Solution
Define \(\text{OPT}(i, v)\) as the minimum cost of a walk from \(s\) to \(v\) using at most \(i\) edges, or \(+\infty\) if no such walk exists (Kleinberg and Tardos 2005). The recurrence below does not force walks to be simple, but when no negative cycle is reachable the minimum is attained by a simple path.
Base cases. With zero edges, only \(s\) is reachable at cost zero: \[ \text{OPT}(0, s) = 0, \qquad \text{OPT}(0, v) = +\infty \text{ for } v \neq s. \]
Recurrence. An optimal \(\leq i\)-edge path to \(v\) either stays within the \(\leq (i{-}1)\)-edge budget, or uses exactly \(i\) edges and arrives via some predecessor \(u\): \[ \text{OPT}(i, v) = \min\!\Bigl(\text{OPT}(i-1, v),\;\min_{(u,v)\in E}\bigl\{\text{OPT}(i-1, u) + w(u,v)\bigr\}\Bigr). \]
Because any simple path uses at most \(n - 1\) edges, \(\text{OPT}(n-1, v) = d(v)\) when no reachable negative cycle exists. Filling the table row by row gives the bounded algorithm:
```
BellmanFordBounded(G = (V, E), w, s):
    for each vertex v in V:
        OPT[0][v] = +infinity
    OPT[0][s] = 0
    for i = 1 to n - 1:
        for each vertex v in V:
            OPT[i][v] = OPT[i-1][v]
        for each edge (u, v) in E:
            if OPT[i-1][u] + w(u, v) < OPT[i][v]:
                OPT[i][v] = OPT[i-1][u] + w(u, v)
    for each edge (u, v) in E:
        if OPT[n-1][u] + w(u, v) < OPT[n-1][v]:
            report "negative cycle reachable from s"
    return OPT[n-1]
```
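As a sanity check, the bounded pseudocode translates directly to Python. The names here (`bellman_ford_bounded`, integer vertex labels, `(u, v, w)` edge triples) are my own conventions, not part of the original text:

```python
import math

def bellman_ford_bounded(n, edges, s):
    """Bounded Bellman-Ford: OPT[i][v] = min cost of a walk s -> v
    using at most i edges. Vertices are 0 .. n-1; edges are (u, v, w)
    triples. Returns (distances, has_reachable_negative_cycle)."""
    OPT = [[math.inf] * n for _ in range(n)]
    OPT[0][s] = 0
    for i in range(1, n):
        OPT[i] = OPT[i - 1][:]          # the <= (i-1)-edge option
        for u, v, w in edges:
            if OPT[i - 1][u] + w < OPT[i][v]:
                OPT[i][v] = OPT[i - 1][u] + w
    # Extra pass: any improvement after n-1 rows flags a negative cycle.
    neg = any(OPT[n - 1][u] + w < OPT[n - 1][v] for u, v, w in edges)
    return OPT[n - 1], neg
```

On the example graph (with \(s, a, b, t\) as \(0, 1, 2, 3\)) this returns distances \([0, 1, -1, 0]\) and no cycle; adding the edge \(b \to a\) of weight \(1\) flips the cycle flag.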
## Algorithm
The bounded form stores one distance array per iteration, using \(\Theta(n^2)\) space in total. Two simplifications reduce it to a \(\Theta(n)\)-space standard form.
Space reduction. Row \(i\) depends only on row \(i-1\), so two arrays suffice.
In-place update. During round \(i\), if some \(u\) is already updated to a value \(\leq \text{OPT}(i-1, u)\), using that fresher value to relax \((u, v)\) can only decrease \(D[v]\) further. A smaller value is still the cost of some actual walk from \(s\) to \(v\), so \(D[v] \geq d(v)\) is preserved and convergence can only be faster.
Combining these gives the standard single-array form:
```
Bellman-Ford(G = (V, E), w, s):
    for each vertex v in V:
        D[v] = +infinity
        parent[v] = none
    D[s] = 0
    repeat n - 1 times:
        changed = false
        for each edge (u, v) in E:
            if D[u] + w(u, v) < D[v]:
                D[v] = D[u] + w(u, v)
                parent[v] = u
                changed = true
        if not changed:
            break
    for each edge (u, v) in E:
        if D[u] + w(u, v) < D[v]:
            report "negative cycle reachable from s"
    return D, parent
The parent array is optional if only distances are needed. If the algorithm finishes without finding a negative cycle, following parent pointers reconstructs shortest paths to reachable vertices.
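The standard form, including early termination and parent-pointer reconstruction, can be sketched in Python as follows (function names, the choice to signal a negative cycle with an exception, and integer vertex labels are my own conventions):

```python
import math

def bellman_ford(n, edges, s):
    """Single-array Bellman-Ford with early termination.

    Vertices are 0 .. n-1; edges are (u, v, w) triples.
    Returns (dist, parent); raises ValueError if a negative
    cycle is reachable from s."""
    dist = [math.inf] * n
    parent = [None] * n
    dist[s] = 0
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u
                changed = True
        if not changed:          # a full quiet round: already converged
            break
    for u, v, w in edges:        # one extra pass for cycle detection
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from s")
    return dist, parent

def path_to(parent, v):
    """Reconstruct the shortest path to v by walking parent pointers."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]
```

On the walkthrough graph, `path_to(parent, t)` recovers \(s \to a \to b \to t\).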
## Walkthrough
Use the harmless negative-edge graph from above, with one extra edge \(a \to t\) of weight \(5\), and process edges in this fixed order each round:
\[(a,t), (b,t), (a,b), (s,b), (s,a).\]
The graph has edges \(s \to a\) of weight \(1\), \(s \to b\) of weight \(4\), \(a \to b\) of weight \(-2\), \(b \to t\) of weight \(1\), and \(a \to t\) of weight \(5\).
| round | \(D[s]\) | \(D[a]\) | \(D[b]\) | \(D[t]\) | what changed |
|---|---|---|---|---|---|
| \(0\) | \(0\) | \(\infty\) | \(\infty\) | \(\infty\) | initialize source |
| \(1\) | \(0\) | \(1\) | \(4\) | \(\infty\) | only edges out of \(s\) can help |
| \(2\) | \(0\) | \(1\) | \(-1\) | \(5\) | \(a \to t\) reaches \(t\) at cost \(6\), then \(b \to t\) improves it to \(5\); \(a \to b\) improves \(b\) |
| \(3\) | \(0\) | \(1\) | \(-1\) | \(0\) | \(b \to t\) uses the improved value of \(b\) |
After \(n - 1 = 3\) rounds, the distances are final. The extra pass finds no improving edge, so no reachable negative cycle is detected. The shortest path to \(t\) is recovered by parent pointers as \(s \to a \to b \to t\), with total cost \(0\).
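The table can be reproduced mechanically. This short sketch (variable names mine) applies the fixed edge order for three rounds and records the distance array after each:

```python
import math

# Edge weights and the fixed processing order from the walkthrough.
w = {('a', 't'): 5, ('b', 't'): 1, ('a', 'b'): -2, ('s', 'b'): 4, ('s', 'a'): 1}
order = [('a', 't'), ('b', 't'), ('a', 'b'), ('s', 'b'), ('s', 'a')]

D = {v: math.inf for v in 'sabt'}
D['s'] = 0
history = [dict(D)]                 # round 0: only the source is known
for _ in range(3):                  # n - 1 = 3 rounds
    for u, v in order:
        if D[u] + w[(u, v)] < D[v]:
            D[v] = D[u] + w[(u, v)]
    history.append(dict(D))

for i, row in enumerate(history):
    print(i, row)
```

Running it reproduces the table row by row, including the intermediate value \(D[t] = 5\) in round 2.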
## Correctness
The proof uses a loop invariant on \(\text{OPT}(k, v)\) — the shortest-walk cost using at most \(k\) edges, as defined in the derivation above.
Bounded form. By the recurrence, \(\text{OPT}(i, v)\) is computed exactly from \(\text{OPT}(i-1, \cdot)\), so by induction \(\text{OPT}(k, v) = \delta_k(v)\) holds precisely after filling row \(k\), where \(\delta_k(v)\) denotes the minimum weight over all walks from \(s\) to \(v\) using at most \(k\) edges. Because simple paths use at most \(n - 1\) edges, \(\delta_{n-1}(v) = d(v)\) when no reachable negative cycle exists.
Standard form. In-place updates mean \(D[v]\) may decrease below \(\delta_k(v)\) during round \(k\) (values propagate further in one pass). The invariant becomes an upper bound: after round \(k\), \(D[v] \leq \delta_k(v)\). The lower bound \(D[v] \geq d(v)\) holds throughout because every \(D[v]\) is the cost of some actual walk from \(s\) to \(v\), and \(d(v)\) is the minimum over all such walks. After \(n - 1\) rounds, \(D[v] \leq \delta_{n-1}(v) = d(v) \leq D[v]\), so \(D[v] = d(v)\).
Proof by induction on \(k\).
Base case (\(k = 0\)). \(D[s] = 0 = \delta_0(s)\). For every other vertex \(v\), \(D[v] = +\infty = \delta_0(v)\).
Inductive step. Assume \(D[u] \leq \delta_{k-1}(u)\) for every \(u\) after round \(k - 1\). Any \(\leq k\)-edge walk to \(v\) with cost \(\delta_k(v)\) ends with some edge \((u, v)\), and its prefix is a \(\leq (k{-}1)\)-edge walk to \(u\). During round \(k\), when \((u, v)\) is relaxed, \(D[u]\) is at most \(\delta_{k-1}(u)\) (possibly less due to in-place updates), so: \[D[v] \;\leq\; D[u] + w(u, v) \;\leq\; \delta_{k-1}(u) + w(u, v) \;=\; \delta_k(v).\]
Negative cycle detection. If no negative cycle is reachable, then after \(n - 1\) rounds \(D[v] = d(v)\) for every vertex, and the triangle inequality \(d(v) \leq d(u) + w(u,v)\) ensures no edge can be further relaxed. By contrapositive, any further relaxation after \(n - 1\) rounds implies a reachable negative cycle.
## Complexity and Tradeoffs
Bellman-Ford performs up to \(n - 1\) rounds and examines all \(m\) edges per round, so
\[ T(n, m) = \Theta(nm). \]
It stores one distance per vertex and, if paths are needed, one parent pointer per vertex, so the usual memory usage is \(\Theta(n)\) beyond the edge list.
Important tradeoffs:
- Negative weights: Bellman-Ford allows arbitrary real edge weights; Dijkstra does not.
- Negative-cycle detection: the final pass detects a reachable negative cycle rather than silently returning misleading distances.
- Speed: for non-negative weights, Dijkstra is usually much faster.
- Early termination: if a full round makes no changes, all distances have already converged and the algorithm can stop.
- SPFA-style queues: queueing only vertices whose distances changed can be faster in many instances, but the worst-case time remains \(\Theta(nm)\).
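The SPFA idea from the last bullet can be sketched as a queue of vertices whose distances just changed, with an enqueue counter for cycle detection. This is an illustrative sketch under my own naming, not a reference implementation:

```python
from collections import deque
import math

def spfa(n, edges, s):
    """Queue-based Bellman-Ford variant (SPFA): re-examine only
    vertices whose distance just changed. Worst case is still
    Theta(nm). Raises ValueError on a reachable negative cycle."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = [math.inf] * n
    dist[s] = 0
    in_queue = [False] * n
    enqueued = [0] * n           # enqueue counts; reaching n signals a cycle
    q = deque([s])
    in_queue[s] = True
    enqueued[s] = 1
    while q:
        u = q.popleft()
        in_queue[u] = False
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if not in_queue[v]:
                    q.append(v)
                    in_queue[v] = True
                    enqueued[v] += 1
                    if enqueued[v] >= n:
                        raise ValueError("negative cycle reachable from s")
    return dist
```

The `in_queue` flags prevent duplicate queue entries; without a negative cycle each vertex is enqueued at most \(n - 1\) times, so a count of \(n\) is conclusive.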
## When to Use It
Use Bellman-Ford when the graph is a single-source shortest-path problem and negative edges are possible. The broader pattern is dynamic programming over path length: a \(k\)-edge solution is built by taking a \((k-1)\)-edge solution and adding one last edge.
| Situation | Typical algorithm |
|---|---|
| Unweighted graph | BFS — \(\Theta(n + m)\) |
| Non-negative weights, single source | Dijkstra — \(O(m \log n)\) with a standard priority queue |
| Arbitrary weights, single source | Bellman-Ford — \(\Theta(nm)\) |
| Need to detect reachable negative cycles | Bellman-Ford |
| Directed acyclic graph, arbitrary weights | Topological-order relaxation — \(\Theta(n + m)\) |
| All-pairs, dense graph, arbitrary weights | Floyd-Warshall — \(\Theta(n^3)\) |
| All-pairs, sparse graph, arbitrary weights | Johnson’s algorithm — \(O(nm + n^2 \log n)\) with Fibonacci heaps, using Bellman-Ford as preprocessing |