Depth-First Search

Motivation

Depth-first search (DFS) explores a graph by following each branch as far as possible before backtracking (Russell and Norvig 2020). It uses memory proportional to the depth of the current path rather than the entire frontier, making it practical for deep or large graphs where breadth-first search would exhaust memory.

Problem

A search problem consists of:

  • A graph \(G = (V, E)\), directed or undirected
  • A start node \(s \in V\)
  • A goal test \(\text{goal} : V \to \{\text{true}, \text{false}\}\)

The depth of a node is its distance from \(s\) in number of edges. The branching factor \(b\) is the maximum number of successors of any node, and \(m\) is the maximum depth of the graph.

Algorithm

DFS maintains a stack (LIFO) of nodes to visit.

  1. Push \(s\); mark \(s\) as visited.
  2. While the stack is non-empty:
    1. Pop node \(v\).
    2. If \(\text{goal}(v)\), return \(v\).
    3. For each unvisited neighbor \(w\) of \(v\) (in reverse order to preserve left-to-right expansion): mark \(w\) visited and push \(w\).
  3. Return failure.

Equivalently, DFS can be implemented recursively: visit \(v\), then recurse on each unvisited neighbor. The call stack plays the role of the explicit stack.

Walkthrough

Pre- and post-order numbering

Run DFS from \(s\) on the same graph used to illustrate BFS, expanding neighbors in alphabetical order. Each node is labeled with \(\text{pre}/\text{post}\) — the order in which DFS first enters and finally leaves the node.

s 1/16 a 2/15 b 4/13 c 6/11 d 3/14 e 5/12 f 7/10 g 8/9 DFS tree edge non-tree (back) edge

The pre-order \(s, a, d, b, e, c, f, g\) is the DFS visit order, contrasting with the BFS dequeue order \(s, a, b, c, d, e, f, g\). DFS dives all the way to \(g\) before fully expanding any of \(s\)’s direct neighbors — at depth \(3\) in the tree, \(g\) is reached before \(b\) has finished exploring. The post-order \(g, f, c, e, b, d, a, s\) is the order in which recursive calls return; this reverse-dependency order is the basis of topological sort.

Cycle detection on a directed graph

Run DFS on the directed graph below, coloring each node white (unvisited), grey (on the current recursion stack), or black (finished). A back edge — one pointing to a grey ancestor — witnesses a cycle.

Edges: A→B, A→C, B→D, C→D, D→B

Processing in alphabetical order:

Step Action White Grey Black
1 visit A B C D A
2 visit B (from A) C D A B
3 visit D (from B) C A B D
4 edge D→B: B is grey
4 back edge found — cycle confirmed

At step 4, D attempts to follow D→B, but B is grey (still on the stack). A grey successor means the current path leads back to an ancestor, forming a cycle: \(B \to D \to B\).

If the graph were acyclic — remove the D→B edge — DFS would finish B and D (turning them black) before backtracking to A, and no grey successor would ever be encountered.

Complexity and Tradeoffs

Time complexity. On an explicit graph \(G = (V, E)\), DFS visits each vertex once and inspects each edge a constant number of times, giving \(O(V + E)\). When the graph is implicit and bounded by branching factor \(b\) and maximum depth \(m\) — as in tree-style search — the same worst-case bound expressed in those parameters is \(O(b^m)\).

Space complexity. On an explicit graph, DFS stores the visited set and recursion stack, both at most \(O(V)\). In tree-style search only the current path and its unexpanded siblings need to be stored, \(O(bm)\). This is the primary advantage over BFS.

Important tradeoffs:

  • Completeness: DFS is complete on finite graphs with cycle detection. Without cycle detection it may loop forever on graphs with cycles. On infinite graphs DFS may explore an infinite branch and never find a solution at finite depth.
  • Optimality: DFS is not optimal — it returns the first solution found along the chosen path, which is not necessarily the shallowest or cheapest. (proof)
  • Memory vs. exhaustiveness: the \(O(bm)\) frontier is much smaller than BFS’s \(O(b^d)\), but the price is exploring deep dead ends before shallow goals.

When to Use It

DFS is the right choice when memory is the binding constraint, when the entire reachable graph must be visited, or when the order of visits matters (pre-order or post-order).

  • Find one node with a property \(P\): the standard algorithm returns the first node popped where \(P(v)\) holds. The result is not generally the nearest such node — use BFS when shallowest is required.
  • Find all nodes with a property \(P\): remove the early-return and collect every \(v\) where \(P(v)\) holds. DFS visits every reachable node exactly once (with cycle detection), so the collection is complete.
  • Dependency resolution and topological sort: post-order numbering reverses the dependency order.
  • Cycle detection on directed and undirected graphs, via the white/grey/black coloring shown above.
  • Bounded-depth exploration: use depth-limited search or iterative deepening when the depth of solutions is unknown but finite.

References

Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. 4th ed. Pearson. https://www.pearson.com/en-us/subject-catalog/p/artificial-intelligence-a-modern-approach/P200000003500/9780137505135.