Forward Chaining

Motivation

Forward chaining is a data-driven inference strategy: start from what is known and repeatedly apply rules to derive new facts until the goal is reached or no further progress is possible (Russell and Norvig 2020). It is the natural execution model for production systems and monitoring applications, where the set of known facts grows over time and the system must react to new information.

Problem

Given a definite Horn clause knowledge base and a query atom \(q\), decide whether \(q\) is entailed by the knowledge base.

A definite Horn clause knowledge base consists of:

  • Facts: ground atoms, e.g., \(\text{mammal}(\text{bat})\).
  • Rules: implications with a conjunction of atoms on the left and a single atom on the right, e.g., \(\text{mammal}(x) \land \text{flies}(x) \Rightarrow \text{unusual}(x)\).

Key Ideas

Fire each rule when all its premises are known

A rule \(p_1 \land \dots \land p_n \Rightarrow q\) becomes a derivation step the moment every premise \(p_i\) is in the known set. Forward chaining re-examines a rule whenever one of its premises is newly derived, so the rule fires as soon as its last missing premise arrives and no derivable atom is missed.

Bottom-up, goal-agnostic propagation

The algorithm proceeds bottom-up: starting from base facts, it derives every consequence, not just the goal. This is its strength on streaming/monitoring workloads (a new fact immediately triggers all its consequences) and its weakness on focused queries (most derivations are irrelevant to the goal).

One pass per fact

Each newly derived fact is enqueued once. When popped, it triggers checks of the rules that mention it, and any newly-firable rules add new facts. Because no fact is enqueued twice, the algorithm runs in time proportional to the total rule-firing work, with no exponential search blow-up.

Algorithm

known ← all base facts
agenda ← all base facts
while agenda is non-empty:
    p ← pop(agenda)
    for each rule (p₁ ∧ … ∧ pₙ ⇒ q) with pᵢ = p for some i:
        if all premises p₁, …, pₙ are in known:
            if q ∉ known:
                add q to known and agenda
if goal ∈ known: succeed  else: fail

Each fact is processed at most once.
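The loop above can be sketched in Python for the propositional case, with the walkthrough's first-order rules ground by hand over its individuals. The function name and KB encoding are illustrative, not from the source:

```python
from collections import deque

def forward_chaining(facts, rules, goal):
    """Decide whether `goal` is entailed by a definite Horn KB.

    facts: iterable of atoms (strings)
    rules: list of (premises, conclusion) pairs, premises a tuple of atoms
    """
    known = set(facts)
    agenda = deque(known)
    while agenda:
        p = agenda.popleft()
        for premises, q in rules:
            # Fire the rule only if p occurs in its body, every premise
            # is known, and the conclusion is new.
            if p in premises and q not in known and all(r in known for r in premises):
                known.add(q)
                agenda.append(q)
    return goal in known

# Ground instances of the walkthrough KB (x = bat and x = dog).
facts = ["mammal(bat)", "flies(bat)", "mammal(dog)"]
rules = [
    (("mammal(bat)", "flies(bat)"), "unusual(bat)"),
    (("mammal(dog)", "flies(dog)"), "unusual(dog)"),
    (("unusual(bat)",), "interesting(bat)"),
    (("unusual(dog)",), "interesting(dog)"),
]
print(forward_chaining(facts, rules, "interesting(bat)"))  # True
print(forward_chaining(facts, rules, "interesting(dog)"))  # False
```

Scanning every rule on each pop is the simplest correct choice; an index from each atom to the rules mentioning it would restrict the inner loop to the relevant rules.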

Walkthrough

Deriving a property in a small KB

Facts: \(\text{mammal}(\text{bat})\), \(\text{flies}(\text{bat})\), \(\text{mammal}(\text{dog})\).

Rules:

  1. \(\text{mammal}(x) \land \text{flies}(x) \Rightarrow \text{unusual}(x)\)
  2. \(\text{unusual}(x) \Rightarrow \text{interesting}(x)\)

Goal: \(\text{interesting}(\text{bat})\).

Processing the agenda step by step:

  1. Pop \(\text{mammal}(\text{bat})\): rule 1 (with \(x = \text{bat}\)) is examined, but \(\text{flies}(\text{bat})\) is missing; nothing derived.
  2. Pop \(\text{flies}(\text{bat})\): rule 1 (with \(x = \text{bat}\)) now has all premises; derives \(\text{unusual}(\text{bat})\).
  3. Pop \(\text{mammal}(\text{dog})\): rule 1 (with \(x = \text{dog}\)) is examined, but \(\text{flies}(\text{dog})\) is missing; nothing derived.
  4. Pop \(\text{unusual}(\text{bat})\): rule 2 (with \(x = \text{bat}\)) has its premise; derives \(\text{interesting}(\text{bat})\).
  5. Pop \(\text{interesting}(\text{bat})\): no rule has this atom on the left; nothing derived.

Goal \(\text{interesting}(\text{bat})\) is in the known set: succeed. Note that \(\text{unusual}(\text{dog})\) and \(\text{interesting}(\text{dog})\) were never derived because \(\text{flies}(\text{dog})\) never appeared — forward chaining stops at the boundary of what the base facts support.

Correctness

Soundness. Every fact added to the known set is derived by a rule whose premises are themselves in the known set, which by induction were either base facts or correctly derived earlier. So everything in the known set is entailed by the KB.

Completeness. Forward chaining is complete for definite Horn clauses: every ground atom entailed by the knowledge base will be derived. Proof sketch: when the agenda empties, the known set is a fixed point of rule application; the interpretation that makes exactly its atoms true is a model of the KB, since any rule with all premises true has already fired. An entailed atom is true in every model of the KB, in particular in this one, and so lies in the known set.

Termination. Each derived fact is enqueued exactly once, and the Herbrand base of a function-free KB is finite, so the agenda empties in finite time.

Complexity and Tradeoffs

Let \(|KB|\) denote the total size of the knowledge base (facts and rule bodies). The total work is \(O(|KB|^2)\) in the worst case — each fact triggers checks across rules that mention it, and re-checking takes time proportional to the rule body.
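The quadratic bound comes from re-scanning rule bodies. A standard refinement keeps, for each rule, a count of premises not yet known and decrements it as facts arrive, so each rule body is touched only once overall; this is the idea behind Dowling and Gallier's linear-time algorithm for propositional Horn clauses. A sketch, assuming atoms are encoded as strings and rules as (premises, conclusion) pairs:

```python
from collections import deque

def forward_chaining_counted(facts, rules, goal):
    """Counting variant: each rule tracks how many of its premises are
    still unknown, so no rule body is ever re-scanned."""
    count = [len(set(premises)) for premises, _ in rules]
    # Index: atom -> indices of rules whose body mentions it.
    uses = {}
    for i, (premises, _) in enumerate(rules):
        for p in set(premises):
            uses.setdefault(p, []).append(i)
    known = set()
    agenda = deque(facts)
    while agenda:
        p = agenda.popleft()
        if p in known:
            continue  # each atom is processed at most once
        known.add(p)
        if p == goal:
            return True
        for i in uses.get(p, []):
            count[i] -= 1
            if count[i] == 0:  # last missing premise just arrived
                agenda.append(rules[i][1])
    return goal in known
```

Because each atom decrements each counter at most once, the total work is proportional to the combined size of the rule bodies.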

The algorithm is data-driven, so it always pays for deriving all consequences of the KB regardless of which goal was asked. For workloads where the same KB is queried repeatedly with different goals, the up-front cost is amortized; for one-off goal queries on large KBs, backward chaining is usually preferable.

Connection to production systems

Forward chaining is the inference strategy implemented by production systems. The working memory corresponds to \(\mathit{known}\), and rules fire when all their conditions are in working memory.

Connection to Datalog

Datalog extends forward chaining to first-order Horn clauses without function symbols. Because function symbols are absent, the Herbrand base is finite, and forward chaining reaches a fixed point in finite time. Datalog is used in deductive databases and program analysis.
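As an illustrative sketch (not from the source), naive bottom-up evaluation of the classic transitive-closure program shows the fixed-point behavior; semi-naive evaluation would additionally restrict each join to tuples derived in the previous round:

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the Datalog program
       path(x, y) :- edge(x, y).
       path(x, y) :- edge(x, z), path(z, y).
    Iterate rule application until no new tuples appear."""
    path = set(edges)
    while True:
        # Join edge(x, z) with path(z, y) to derive path(x, y).
        new = {(x, y) for (x, z) in edges for (z2, y) in path if z == z2} - path
        if not new:
            return path  # fixed point reached
        path |= new

edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(transitive_closure(edges)))
# ("a", "d") is derived even though there is no direct edge from a to d.
```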

When to Use It

  • Monitoring / streaming systems (new facts arrive over time): forward chaining; derive consequences eagerly as facts come in.
  • Production / rule-based systems: forward chaining; the canonical execution model.
  • Datalog / deductive databases: forward chaining, with semi-naive evaluation for efficiency.
  • Focused goal queries on a large KB: backward chaining, which avoids deriving irrelevant facts.
  • KB with non-Horn clauses or function symbols: use general resolution; forward chaining requires definite Horn form.

References

Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. 4th ed. Pearson. https://www.pearson.com/en-us/subject-catalog/p/artificial-intelligence-a-modern-approach/P200000003500/9780137505135.