Dijkstra’s Algorithm

Motivation

Dijkstra’s algorithm finds the shortest-path distances from a fixed source to every other vertex in a weighted graph in \(O(m \log n)\) time — fast enough to power GPS navigation, internet packet forwarding, and network analysis in production (Dijkstra 1959; Kleinberg and Tardos 2005).

The shortest-path problem has two complications: negative edges break greedy finalization, and negative cycles make distances undefined. Dijkstra sidesteps both by requiring non-negative edge weights — the common case for physical distances, latencies, and costs. When that assumption holds, it is asymptotically faster than the general-purpose Bellman-Ford, which handles arbitrary weights at \(\Theta(nm)\).

Problem

Let \(G = (V, E)\) be a directed graph with non-negative edge weights \(w : E \to \mathbb{R}_{\ge 0}\), and fix a source \(s \in V\). The shortest-path distance from \(s\) to \(v\) is \[ d(v) = \min_{P : s \to v} \sum_{e \in P} w(e), \] or \(+\infty\) if no path exists.

The non-negativity assumption is critical. Dijkstra is incorrect on graphs with negative edges.

Key Ideas

Breadth-first search finds shortest paths in unweighted graphs by visiting vertices in order of increasing hop count, finalizing each one on first visit. The reason BFS can finalize so eagerly is that with unit weights, no later-discovered path can be shorter than the first one found. Dijkstra extends this idea to arbitrary non-negative weights by replacing the FIFO queue with a priority queue keyed on tentative distance.

The algorithm rests on a single invariant: when a vertex is extracted from the priority queue, its tentative distance is already optimal and will never improve.

The reasoning is a one-step argument. Suppose \(u\) is about to be extracted with tentative distance \(D[u]\). Any path from \(s\) to \(u\) that has not yet been fully explored must pass through some unexplored vertex \(y\). At that moment, \(D[y] \ge D[u]\) (otherwise \(y\) would have been extracted first). Since all remaining edges have non-negative weight, extending any path through \(y\) onward to \(u\) costs at least \(D[y] \ge D[u]\). So no unexplored path can beat \(D[u]\).

This is why non-negative weights are essential: a single negative edge could let a path through a higher-tentative-distance vertex sneak in under the current minimum, invalidating the finalization. The negative-edge walkthrough below shows exactly this failure.

The greedy choice — always extract the smallest tentative distance — is what makes the invariant self-reinforcing. Each extraction is provably final, so the algorithm never needs to revisit a vertex. When all edge weights are equal, Dijkstra reduces to BFS, with the priority queue degenerating into a FIFO ordering by hop count.

Algorithm

Maintain a tentative distance estimate \(D[v]\) for each vertex, initialized to \(D[s] = 0\) and \(D[v] = +\infty\) otherwise. Maintain a set \(S\) of vertices whose distance is final.

D[s] = 0, D[v] = +infinity for v != s
S = {}
PQ = priority queue of all vertices keyed by D
while PQ is non-empty:
    u = PQ.extract-min()
    S.add(u)
    for each edge (u, v) with weight w:
        if D[u] + w < D[v]:
            D[v] = D[u] + w
            PQ.decrease-key(v, D[v])

The relaxation step D[v] = min(D[v], D[u] + w) is shared by all shortest-path algorithms. Dijkstra’s contribution is the order: always next process the vertex with the smallest current tentative distance.

Walkthrough

Shortest paths from s

A graph with five vertices and non-negative edge weights. Highlighted in green is the shortest-path tree from \(s\).

State of \(D[v]\) and the priority queue at each iteration. A dash means the value is unchanged from the previous row; once a vertex is extracted, its \(D\) value is final.

Step	Extract	\(D[s]\)	\(D[a]\)	\(D[b]\)	\(D[c]\)	\(D[t]\)	PQ (unfinalized)	Note
init	—	\(0\)	\(\infty\)	\(\infty\)	\(\infty\)	\(\infty\)	\(\{s, a, b, c, t\}\)
1	\(s\) (\(D{=}0\))	—	\(4\)	\(2\)	—	—	\(\{b{:}2, a{:}4, c{:}\infty, t{:}\infty\}\)	relax \(s\to a\), \(s\to b\)
2	\(b\) (\(D{=}2\))	—	\(3\)	—	\(7\)	—	\(\{a{:}3, c{:}7, t{:}\infty\}\)	\(b\to a\) improves \(a\) from \(4\) to \(3\); \(b\to c\)
3	\(a\) (\(D{=}3\))	—	—	—	\(6\)	—	\(\{c{:}6, t{:}\infty\}\)	\(a\to c\) improves \(c\) from \(7\) to \(6\)
4	\(c\) (\(D{=}6\))	—	—	—	—	\(7\)	\(\{t{:}7\}\)	relax \(c\to t\)
5	\(t\) (\(D{=}7\))	—	—	—	—	—	\(\{\}\)	done

The relaxation through \(b\) at step 2 is the interesting step: \(D[a]\) drops from \(4\) to \(3\) because the indirect path \(s \to b \to a\) (weight \(2+1=3\)) beats the direct edge \(s \to a\) (weight \(4\)). Without that update, Dijkstra would later report \(d(a) = 4\).

Why negative edges break Dijkstra

Same shape of graph, but with one negative edge. Here \(s \to a\) has weight \(1\), \(s \to b\) has weight \(5\), \(a \to t\) has weight \(2\), and \(b \to t\) has weight \(-4\).

True distances: \(d(a)=1\), \(d(b)=5\), \(d(t) = \min(1+2,\ 5-4) = 1\), achieved via \(s \to b \to t\).

Step	Extract	\(D[s]\)	\(D[a]\)	\(D[b]\)	\(D[t]\)	PQ (unfinalized)	Note
init	—	\(0\)	\(\infty\)	\(\infty\)	\(\infty\)	\(\{s, a, b, t\}\)
1	\(s\) (\(D{=}0\))	—	\(1\)	\(5\)	—	\(\{a{:}1, b{:}5, t{:}\infty\}\)	relax \(s\to a\), \(s\to b\)
2	\(a\) (\(D{=}1\))	—	—	—	\(3\)	\(\{t{:}3, b{:}5\}\)	relax \(a\to t\)
3	\(t\) (\(D{=}3\))	—	—	—	—	\(\{b{:}5\}\)	finalized at 3, but \(d(t)=1\)
4	\(b\) (\(D{=}5\))	—	—	—	(final)	\(\{\}\)	\(b\to t\) would give \(5-4=1\) — too late

Dijkstra reports \(D[t] = 3\), but the correct answer is \(1\). The fault is at step 3: \(t\) is extracted and finalized before the path through \(b\) has been considered. The Key Ideas argument fails because the unexplored vertex \(b\) (with \(D[b]=5\)) can reach \(t\) through the negative edge at total cost \(5 + (-4) = 1 < D[t]=3\). Non-negativity is exactly what would have ruled this out.

Correctness

Invariant. When a vertex \(u\) is extracted, \(D[u] = d(u)\). (proof)

Proof by induction on the order of extraction. Suppose \(u\) is extracted with \(D[u] > d(u)\), and consider the true shortest path \(P\) from \(s\) to \(u\). Walking along \(P\), let \(y\) be the first vertex not yet in \(S\), and let \(x\) be its predecessor (with \(x \in S\), since \(s \in S\) from iteration one). By the inductive hypothesis \(D[x] = d(x)\), and the relaxation of edge \((x, y)\) has set \(D[y] \le d(x) + w(x, y) = d(y)\). Since edge weights are non-negative, \(d(y) \le d(u)\), so \[ D[y] \le d(y) \le d(u) < D[u]. \] But the priority queue would have extracted \(y\) before \(u\) — contradiction. So \(D[u] = d(u)\) when extracted.

The non-negativity assumption is used in the step \(d(y) \le d(u)\): walking further along a path can only increase the distance.

Complexity and Tradeoffs

The number of operations is:

\(n\) extract-min operations.
\(m\) decrease-key operations (one per edge).

With a binary heap: extract-min and decrease-key both cost \(O(\log n)\), giving \(O((n + m) \log n)\). With a Fibonacci heap the bound improves to \(O(m + n \log n)\), asymptotically optimal among comparison-based implementations on dense graphs.

For graphs with small integer weights, faster algorithms (Thorup’s \(O(m + n)\) for undirected with positive integer weights) exist but are rarely needed in practice.

When to Use It

Single-pair shortest path: stop the loop as soon as the destination is extracted. A* search (described separately) accelerates this further with a heuristic.
All-pairs shortest paths on non-negative graphs: run Dijkstra from each vertex, \(O(n (n + m) \log n)\). Faster than Floyd–Warshall on sparse graphs.
Bidirectional Dijkstra: search forward from \(s\) and backward from \(t\), meeting in the middle. Halves the explored region in many practical cases.

References

Dijkstra, E. W. 1959. “A Note on Two Problems in Connexion with Graphs.” Numerische Mathematik 1 (1): 269–71. https://doi.org/10.1007/bf01386390.

Kleinberg, Jon, and Éva Tardos. 2005. Algorithm Design. Pearson/Addison-Wesley. https://www.pearson.com/en-us/subject-catalog/p/algorithm-design/P200000003259.