Correctness of the Kosaraju–Sharir Algorithm
Claim
Run a complete DFS on a directed graph \(G = (V, E)\) and let \(f(v)\) denote the finish time of vertex \(v\). Process vertices in decreasing order of \(f\), running a fresh DFS in the transpose \(G^T\) from each still-unvisited popped vertex. Each such DFS visits exactly the strongly connected component containing its starting vertex, and the SCCs are emitted in topological order of the condensation (Sharir 1981; Cormen et al. 2009).
Proof
Write \(u \rightsquigarrow v\) for “\(v\) is reachable from \(u\) in \(G\).” For a set \(S \subseteq V\), write \(f(S) = \max_{v \in S} f(v)\).
Lemma (finish-time order respects the condensation)
Let \(C \neq C'\) be SCCs of \(G\) and suppose there is an edge \(C \to C'\) in the condensation (i.e., some \(u \in C, v \in C'\) with \(u \to v \in E\)). Then \(f(C) > f(C')\).
Proof. Let \(x \in C \cup C'\) be the vertex that DFS visits first. Two cases.
Case 1: \(x \in C\). When DFS enters \(x\), every vertex of \(C \cup C'\) is white (unvisited). From \(x\), DFS can reach all of \(C\) (since \(C\) is strongly connected, every vertex of \(C\) is reachable from \(x\)) and all of \(C'\) (via \(x \rightsquigarrow u \to v \rightsquigarrow w\) for any \(w \in C'\), since \(C'\) is strongly connected and we can reach \(u\) from \(x\) inside \(C\)). The white-path theorem of DFS therefore makes every vertex of \(C \cup C'\) a descendant of \(x\) in the DFS forest. Hence \(x\) finishes last among \(C \cup C'\), so \(f(x) = f(C \cup C') = \max(f(C), f(C'))\). Since \(x \in C\), this gives \(f(C) \geq f(x) > f(C')\) (strict because \(x \notin C'\)).
Case 2: \(x \in C'\). When DFS enters \(x\), every vertex of \(C'\) is white, but no vertex of \(C\) is reachable from \(x\) — there is no path from \(C'\) to \(C\) in \(G\), because such a path together with \(C \to C'\) would force \(C\) and \(C'\) to merge into a single SCC, contradicting \(C \neq C'\). So DFS from \(x\) visits all of \(C'\) but no vertex of \(C\), and \(x\) (along with the rest of \(C'\)) finishes before DFS ever enters any vertex of \(C\). Therefore \(f(C') < f(v)\) for every \(v \in C\) that is visited later, so \(f(C') < f(C)\).
In both cases \(f(C) > f(C')\). \(\square\)
Main result
Let \(v_1, v_2, \dots, v_n\) be the vertices in decreasing finish-time order. We process them in this order on \(G^T\).
Claim: when \(v_i\) is popped and is unvisited in the second pass, DFS from \(v_i\) in \(G^T\) visits exactly the SCC of \(G\) containing \(v_i\).
Call that SCC \(C\).
DFS visits all of \(C\): Inside \(C\) every two vertices are mutually reachable in \(G\), hence also in \(G^T\). None of \(C\)’s vertices has been visited yet in the second pass: by the lemma, every SCC reachable in \(G\) from \(C\) via a non-trivial condensation path \(C \to C'' \to \dots\) has strictly smaller \(f\) than \(f(C)\), and every SCC that condensation-reaches \(C\) has strictly larger \(f(C)\). Among the SCCs already processed in earlier iterations, none can be \(C\) itself; and we are at the first unvisited vertex of \(C\). So DFS in \(G^T\) from \(v_i\) visits all of \(C\).
DFS does not leave \(C\): Suppose \(G^T\) has an edge from some \(w \in C\) to some \(w' \notin C\). Then \(G\) has an edge \(w' \to w\), so the condensation of \(G\) has an edge \(C' \to C\) where \(C'\) is \(w'\)’s SCC. By the lemma, \(f(C') > f(C) \geq f(v_i)\). So every vertex of \(C'\) has a finish time greater than \(f(v_i)\); they were all popped before \(v_i\). They were either visited in earlier second-pass DFS calls (in which case \(w'\) is already marked, and the edge \(w \to w'\) in \(G^T\) does nothing) or — by the same argument applied recursively to whichever SCC contained the popping root — they were visited as part of some earlier component. Either way, when DFS in \(G^T\) encounters \(w'\) from \(w\), \(w'\) is already marked, so DFS does not cross from \(C\) to \(C'\).
Therefore the DFS rooted at \(v_i\) in \(G^T\) visits exactly \(C\). \(\square\)
Topological order of the condensation
The SCC containing the vertex with the overall largest finish time has no incoming condensation edge (any such edge would, by the lemma, force a finish time larger still). So the first SCC emitted is a source of the condensation. After removing it, the same argument applies to the remaining graph: the next emitted SCC is a source of what is left. Iterating, the SCCs are emitted in a topological order of the condensation. \(\square\)