Backward Chaining
Motivation
Backward chaining is a goal-directed inference strategy: start from the query and work backward through rules, reducing the goal to subgoals, until all subgoals are known facts (Russell and Norvig 2020). It focuses computation on rules relevant to the goal, avoiding the derivation of irrelevant facts that forward chaining would produce.
Problem
Given a definite Horn clause knowledge base (facts and rules \(p_1 \land \cdots \land p_n \Rightarrow q\)) and a query atom \(g\) that may contain variables, return a substitution \(\theta\) such that \(\theta(g)\) is entailed by the KB, or report failure if no such substitution exists.
Backward chaining uses unification (see resolution) to match goals against rule heads containing variables.
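As a concrete illustration, unification can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions (variables are `?`-prefixed strings, compound terms are tuples, and the occurs check is omitted, as in most Prolog systems), not a reference implementation.

```python
def is_var(t):
    """Variables are strings prefixed with '?'; everything else is a constant or compound."""
    return isinstance(t, str) and t.startswith("?")

def unify(a, b, subst):
    """Return a substitution extending `subst` that makes a and b equal, or None on failure."""
    if subst is None:
        return None
    if a == b:
        return subst
    if is_var(a):
        return unify_var(a, b, subst)
    if is_var(b):
        return unify_var(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):          # unify argument lists pairwise
            subst = unify(x, y, subst)
        return subst
    return None

def unify_var(v, t, subst):
    """Bind variable v to term t, following existing bindings first (no occurs check)."""
    if v in subst:
        return unify(subst[v], t, subst)
    if is_var(t) and t in subst:
        return unify(v, subst[t], subst)
    return {**subst, v: t}

# Matching a goal against a rule head:
unify(("ancestor", "?x", "?y"), ("ancestor", "alice", "carol"), {})
# → {'?x': 'alice', '?y': 'carol'}
```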
Key Ideas
Reduce a goal to subgoals
If the goal \(g\) matches a known fact, we are done. Otherwise, find a rule \(p_1 \land \cdots \land p_n \Rightarrow q\) whose head \(q\) unifies with \(g\). The proof of \(g\) now reduces to proving each premise \(p_i\) under the unifier — turning a single goal into \(n\) smaller goals that share variable bindings through the substitution.
This recursive structure is depth-first search on the proof tree. Each rule choice is a branching point; each premise becomes a recursive call.
Why work backward from the goal
Forward chaining derives every consequence of the KB, then checks whether the goal happens to appear. Backward chaining only ever follows rules whose head unifies with a current goal — so a rule that has nothing to do with \(g\) is never even examined. On large KBs with focused queries this prunes most of the work.
Depth-first means incomplete in general
Standard backward chaining explores rule choices in the order they appear, depth-first. When a branch fails, it backtracks and tries the next applicable rule. This is the execution model of Prolog, where the knowledge base is a Prolog program and queries are goals.
The trade-off: depth-first search can plunge into an infinite recursion on left-recursive rules and never return to try alternatives that would have succeeded. So backward chaining trades completeness for goal-focused efficiency.
Algorithm
To prove goal \(g\) with current substitution \(\theta\):
prove(g, θ):
    if θ(g) unifies with a known fact (unifier θ_f): return θ_f ∘ θ
    for each rule (p₁ ∧ … ∧ pₙ ⇒ q) in KB (with fresh variable names):
        θ₀ ← unify(θ(g), q)
        if θ₀ succeeds:
            θ₁ ← prove(p₁, θ₀ ∘ θ)
            θ₂ ← prove(p₂, θ₁)
            …
            θₙ ← prove(pₙ, θₙ₋₁)
            if every subgoal succeeds: return θₙ
            otherwise backtrack and try the next rule
    fail
The algorithm returns a substitution \(\theta\) such that \(\theta(g)\) follows from \(KB\), or fails if no proof exists. The order in which rules are tried affects efficiency and termination — Prolog tries rules in the order they are written.
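The pseudocode can be fleshed out into a runnable sketch. The following is an illustrative Python implementation under simplifying assumptions (variables are `?`-prefixed strings, rules are `(premises, head)` pairs, no occurs check, answers produced lazily by generators), not a production prover.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(a, b, subst):
    """Return a substitution extending `subst` that makes a and b equal, or None."""
    if subst is None:
        return None
    if a == b:
        return subst
    if is_var(a):
        return unify_var(a, b, subst)
    if is_var(b):
        return unify_var(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
        return subst
    return None

def unify_var(v, t, subst):
    if v in subst:
        return unify(subst[v], t, subst)
    if is_var(t) and t in subst:
        return unify(v, subst[t], subst)
    return {**subst, v: t}

def substitute(theta, t):
    """Apply theta to a term, following binding chains."""
    while is_var(t) and t in theta:
        t = theta[t]
    if isinstance(t, tuple):
        return tuple(substitute(theta, x) for x in t)
    return t

def rename(premises, head, fresh):
    """Standardize apart: give the rule's variables fresh names."""
    n = next(fresh)
    def r(t):
        if is_var(t):
            return f"{t}_{n}"
        if isinstance(t, tuple):
            return tuple(r(x) for x in t)
        return t
    return [r(p) for p in premises], r(head)

def prove(goal, facts, rules, theta, fresh):
    """Yield every substitution under which `goal` follows from the KB."""
    goal = substitute(theta, goal)
    for fact in facts:                      # base case: goal matches a known fact
        s = unify(goal, fact, theta)
        if s is not None:
            yield s
    for premises, head in rules:            # reduce goal via each matching rule
        premises, head = rename(premises, head, fresh)
        s = unify(goal, head, theta)
        if s is not None:
            yield from prove_all(premises, facts, rules, s, fresh)

def prove_all(goals, facts, rules, theta, fresh):
    """Prove a conjunction of subgoals left to right, threading bindings."""
    if not goals:
        yield theta
    else:
        for s in prove(goals[0], facts, rules, theta, fresh):
            yield from prove_all(goals[1:], facts, rules, s, fresh)

# A small KB: prove that alice is an ancestor of carol.
facts = [("parent", "alice", "bob"), ("parent", "bob", "carol")]
rules = [
    ([("parent", "?x", "?y")], ("ancestor", "?x", "?y")),
    ([("parent", "?x", "?z"), ("ancestor", "?z", "?y")], ("ancestor", "?x", "?y")),
]
answer = next(prove(("ancestor", "alice", "carol"), facts, rules, {}, itertools.count()), None)
# answer is a substitution dict, i.e. the goal is provable
```

The generators make backtracking implicit: exhausting one rule's solutions automatically moves on to the next applicable rule, mirroring Prolog's rule-order search.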
Walkthrough
Proving a goal by recursive subgoal reduction
Facts: \(\text{parent}(\text{alice}, \text{bob})\), \(\text{parent}(\text{bob}, \text{carol})\).
Rules:
- \(\text{parent}(x, y) \Rightarrow \text{ancestor}(x, y)\)
- \(\text{parent}(x, z) \land \text{ancestor}(z, y) \Rightarrow \text{ancestor}(x, y)\)
Goal: \(\text{ancestor}(\text{alice}, \text{carol})\).
| Step | Current goal | Action | Substitution |
|---|---|---|---|
| 1 | \(\text{ancestor}(\text{alice}, \text{carol})\) | Try rule 1: unify head \(\text{ancestor}(x, y)\) → \(\{x = \text{alice}, y = \text{carol}\}\). Subgoal: \(\text{parent}(\text{alice}, \text{carol})\). | \(\{x=\text{alice}, y=\text{carol}\}\) |
| 2 | \(\text{parent}(\text{alice}, \text{carol})\) | Not a known fact, no rule with this head. Fail; backtrack. | — |
| 3 | \(\text{ancestor}(\text{alice}, \text{carol})\) | Try rule 2: unify head → \(\{x = \text{alice}, y = \text{carol}\}\), with \(z\) still unbound. Subgoals: \(\text{parent}(\text{alice}, z)\) and \(\text{ancestor}(z, \text{carol})\). | \(\{x=\text{alice}, y=\text{carol}\}\) |
| 4 | \(\text{parent}(\text{alice}, z)\) | Matches fact \(\text{parent}(\text{alice}, \text{bob})\). Bind \(z = \text{bob}\). | add \(\{z=\text{bob}\}\) |
| 5 | \(\text{ancestor}(\text{bob}, \text{carol})\) | Recurse. Try rule 1: subgoal \(\text{parent}(\text{bob}, \text{carol})\). | — |
| 6 | \(\text{parent}(\text{bob}, \text{carol})\) | Matches fact. Success. | — |
The proof completes: \(\text{ancestor}(\text{alice}, \text{carol})\) holds because \(\text{alice}\) is a parent of \(\text{bob}\), who is an ancestor of \(\text{carol}\). Notice rule 1 was tried first (step 1) and failed before rule 2 succeeded — the backtracking trace is part of normal operation.
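Backtracking also lets one query enumerate multiple answers: asking \(\text{ancestor}(\text{alice}, w)\) yields \(w = \text{bob}\) via rule 1, then \(w = \text{carol}\) via rule 2's recursion. The self-contained Python sketch below reproduces this (toy representation: variables as `?`-prefixed strings, rules as `(premises, head)` pairs, no occurs check).

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(a, b, subst):
    if subst is None:
        return None
    if a == b:
        return subst
    if is_var(a):
        return unify_var(a, b, subst)
    if is_var(b):
        return unify_var(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
        return subst
    return None

def unify_var(v, t, subst):
    if v in subst:
        return unify(subst[v], t, subst)
    if is_var(t) and t in subst:
        return unify(v, subst[t], subst)
    return {**subst, v: t}

def substitute(theta, t):
    while is_var(t) and t in theta:
        t = theta[t]
    if isinstance(t, tuple):
        return tuple(substitute(theta, x) for x in t)
    return t

def rename(premises, head, fresh):
    n = next(fresh)
    def r(t):
        if is_var(t):
            return f"{t}_{n}"
        if isinstance(t, tuple):
            return tuple(r(x) for x in t)
        return t
    return [r(p) for p in premises], r(head)

def prove(goal, facts, rules, theta, fresh):
    goal = substitute(theta, goal)
    for fact in facts:
        s = unify(goal, fact, theta)
        if s is not None:
            yield s
    for premises, head in rules:
        premises, head = rename(premises, head, fresh)
        s = unify(goal, head, theta)
        if s is not None:
            yield from prove_all(premises, facts, rules, s, fresh)

def prove_all(goals, facts, rules, theta, fresh):
    if not goals:
        yield theta
    else:
        for s in prove(goals[0], facts, rules, theta, fresh):
            yield from prove_all(goals[1:], facts, rules, s, fresh)

facts = [("parent", "alice", "bob"), ("parent", "bob", "carol")]
rules = [
    ([("parent", "?x", "?y")], ("ancestor", "?x", "?y")),
    ([("parent", "?x", "?z"), ("ancestor", "?z", "?y")], ("ancestor", "?x", "?y")),
]

# Enumerate every binding for ?w in ancestor(alice, ?w):
answers = [substitute(s, "?w")
           for s in prove(("ancestor", "alice", "?w"), facts, rules, {}, itertools.count())]
# answers == ["bob", "carol"]
```

The answer order follows rule order, exactly as in the table: rule 1 produces `bob` directly, then backtracking into rule 2 produces `carol`.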
Correctness
Soundness. Each step of backward chaining either matches a known fact or unifies the goal with a rule head and recursively proves the resulting subgoals under a composed substitution. Both moves correspond to valid applications of Horn-clause inference, so the chain of reasoning is correct: every answer substitution \(\theta\) returned satisfies \(KB \models \theta(g)\).
Incompleteness. Unlike forward chaining, backward chaining is not guaranteed to terminate. Left-recursive rules such as \(\text{ancestor}(x, z) \land \text{parent}(z, y) \Rightarrow \text{ancestor}(x, y)\), ordered before the base case, can cause depth-first search to loop forever, missing answers that exist. Prolog’s depth-first strategy can be incomplete in this sense even when the knowledge base has a finite answer set. Checking for repeated subgoals on the current path, or switching to breadth-first or iterative-deepening search, restores completeness at the cost of extra work.
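The failure mode and one remedy can both be made concrete. In the self-contained sketch below (toy representation: `?`-prefixed variables, `(premises, head)` rules, no occurs check), the left-recursive rule is listed first, so plain depth-first search recurses without bound on \(\text{ancestor}(\text{alice}, z)\) (in Python, a `RecursionError`); a depth bound plus iterative deepening still finds the proof.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(a, b, subst):
    if subst is None:
        return None
    if a == b:
        return subst
    if is_var(a):
        return unify_var(a, b, subst)
    if is_var(b):
        return unify_var(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
        return subst
    return None

def unify_var(v, t, subst):
    if v in subst:
        return unify(subst[v], t, subst)
    if is_var(t) and t in subst:
        return unify(v, subst[t], subst)
    return {**subst, v: t}

def substitute(theta, t):
    while is_var(t) and t in theta:
        t = theta[t]
    if isinstance(t, tuple):
        return tuple(substitute(theta, x) for x in t)
    return t

def rename(premises, head, fresh):
    n = next(fresh)
    def r(t):
        if is_var(t):
            return f"{t}_{n}"
        if isinstance(t, tuple):
            return tuple(r(x) for x in t)
        return t
    return [r(p) for p in premises], r(head)

def prove(goal, facts, rules, theta, fresh, depth):
    if depth < 0:                 # depth bound: cut off runaway recursion
        return
    goal = substitute(theta, goal)
    for fact in facts:
        s = unify(goal, fact, theta)
        if s is not None:
            yield s
    for premises, head in rules:  # each rule application costs one unit of depth
        premises, head = rename(premises, head, fresh)
        s = unify(goal, head, theta)
        if s is not None:
            yield from prove_all(premises, facts, rules, s, fresh, depth - 1)

def prove_all(goals, facts, rules, theta, fresh, depth):
    if not goals:
        yield theta
    else:
        for s in prove(goals[0], facts, rules, theta, fresh, depth):
            yield from prove_all(goals[1:], facts, rules, s, fresh, depth)

def iterative_deepening(goal, facts, rules, max_depth=8):
    """Retry with growing depth limits; complete for proofs up to max_depth."""
    for d in range(max_depth + 1):
        s = next(prove(goal, facts, rules, {}, itertools.count(), d), None)
        if s is not None:
            return s
    return None

facts = [("parent", "alice", "bob"), ("parent", "bob", "carol")]
rules = [
    # Left-recursive rule FIRST: unbounded depth-first search would loop on it.
    ([("ancestor", "?x", "?z"), ("parent", "?z", "?y")], ("ancestor", "?x", "?y")),
    ([("parent", "?x", "?y")], ("ancestor", "?x", "?y")),
]
proof = iterative_deepening(("ancestor", "alice", "carol"), facts, rules)
# proof is found at depth limit 2, despite the left recursion
```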
Complexity and Tradeoffs
In the worst case, backward chaining can explore an exponential proof tree (one branch per applicable rule, branching at every subgoal). On focused queries where few rules apply at each step, the search is far smaller — this is the key benefit over forward chaining, which always derives every consequence.
Backward chaining is the execution model of Prolog and the inference strategy of most rule-based expert systems and logic-programming languages. Memoization (tabling, in Prolog terms) avoids redundant work by caching subgoal results and detecting recursive calls, restoring completeness on a wide class of programs.
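Full SLG tabling is involved, but its two core moves, caching answered subgoals and cutting repeated ones, can be shown propositionally. The following toy sketch (atoms are plain strings, not the tabling machinery of any real Prolog system) caches only successes, since a failure under a loop check may still succeed along another proof path.

```python
def prove(goal, facts, rules, memo, visited=frozenset()):
    """Propositional backward chaining with memoization and a loop check."""
    if goal in memo:                 # tabled: reuse a previously proved goal
        return True
    if goal in facts:
        memo.add(goal)
        return True
    if goal in visited:              # goal already on this proof path: cut the loop
        return False
    for body, head in rules:
        if head == goal and all(
            prove(p, facts, rules, memo, visited | {goal}) for p in body
        ):
            memo.add(goal)           # cache the success for later queries
            return True
    return False

# 'a' and 'b' depend on each other, but 'a' also follows from the fact 'c'.
facts = {"c"}
rules = [(["b"], "a"), (["a"], "b"), (["c"], "a")]
memo = set()
a_holds = prove("a", facts, rules, memo)   # True: a ← c, despite the a/b cycle
b_holds = prove("b", facts, rules, memo)   # True: b ← a, answered from the memo
```

Without the loop check, the `a ← b` / `b ← a` cycle would recurse forever; with it, the cycle fails cleanly and the alternative rule `a ← c` is tried.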
When to Use It
| Situation | Approach |
|---|---|
| One-off goal queries on a large KB | Backward chaining — derives only goal-relevant facts. |
| Prolog programming | Backward chaining (depth-first with rule-order tiebreaking). |
| Repeated queries on a stable KB | Forward chaining — precomputes all consequences. |
| Left-recursive rules / completeness required | Use tabling, iterative deepening, or resolution. |
| Streaming data, monitoring | Forward chaining — react to new facts. |
Comparison with Forward Chaining
| | Forward chaining | Backward chaining |
|---|---|---|
| Direction | Data → goal | Goal → data |
| Completeness | Yes (Horn clauses) | No (may loop) |
| Focus | Derives all consequences | Derives only goal-relevant facts |
| Typical use | Monitoring, alerting, Datalog | Query answering, Prolog |
Forward chaining is preferable when many queries will be posed against a stable knowledge base, since all consequences are precomputed. Backward chaining is preferable for one-off queries where most of the knowledge base is irrelevant.