Logical Entailment

Motivation

When we say a reasoner is correct, we mean it produces conclusions that genuinely follow from its premises — not merely consequences it happens to compute. Logical entailment is the standard against which correctness is measured: a precise, semantic notion of “follows from” defined entirely in terms of meaning, with no reference to any particular algorithm (Enderton 2001).

The split between what follows and how to derive it is the foundational distinction of mathematical logic. Entailment is the semantic side; proof is the syntactic side. The whole apparatus of soundness, completeness, and decidability quantifies how well a proof system tracks entailment.

Definition

Fix a logical language with sentences \(\phi, \psi, \ldots\) and a class of interpretations (or models) \(\mathcal{M}\). Each interpretation assigns truth values to atomic claims; truth of compound sentences is built up compositionally. Write \(M \models \phi\) to mean “\(\phi\) is true in \(M\).”

A set of sentences \(\Gamma\) entails a sentence \(\phi\), written \(\Gamma \models \phi\), if every model that makes all of \(\Gamma\) true also makes \(\phi\) true:

\[\Gamma \models \phi \quad \iff \quad \text{for every } M \in \mathcal{M},\ \text{if } M \models \gamma \text{ for all } \gamma \in \Gamma,\ \text{then } M \models \phi.\]

Equivalently: it is impossible to make \(\Gamma\) true and \(\phi\) false simultaneously.

The relation \(\models\) depends only on what models exist and which sentences they satisfy — not on any inference procedure. Two sentences with the same models are interchangeable; two proof systems for the same logic agree on \(\models\) even if their syntactic derivations look nothing alike.

Examples and Non-Examples

Propositional entailment

Let \(\Gamma = \{p \Rightarrow q,\ p\}\). The only model where both members of \(\Gamma\) are true is one where \(p = \top\) and \(q = \top\); that model also satisfies \(q\). So \(\Gamma \models q\). This is the entailment underwriting modus ponens.

By contrast, \(\{p \Rightarrow q,\ q\} \not\models p\): a model with \(p = \bot,\ q = \top\) makes \(\Gamma\) true but not \(p\).

First-order entailment

Let \(\Gamma\) contain \(\forall x.\, \text{Human}(x) \Rightarrow \text{Mortal}(x)\) and \(\text{Human}(\text{Socrates})\). Every model that satisfies both must, by interpreting the universal, also satisfy \(\text{Mortal}(\text{Socrates})\). So \(\Gamma \models \text{Mortal}(\text{Socrates})\).

Vacuous entailment

A set of sentences that has no models — an unsatisfiable \(\Gamma\) — entails every sentence. There is no model where \(\Gamma\) is true and \(\phi\) is false, simply because there is no model where \(\Gamma\) is true at all. This is called ex falso: a contradiction proves anything.

Companion Notions

A handful of related properties recur throughout logic:

Satisfiability. \(\Gamma\) is satisfiable if some model satisfies all of it. A single sentence \(\phi\) is satisfiable iff \(\{\phi\}\) is.
Validity. \(\phi\) is valid if every model satisfies it, i.e., \(\emptyset \models \phi\). Tautologies are the propositional case.
Equivalence. \(\phi\) and \(\psi\) are equivalent if each entails the other; the symbol \(\equiv\) is sometimes used.

The three are interchangeable:

\[\Gamma \models \phi \iff \Gamma \cup \{\lnot \phi\} \text{ is unsatisfiable} \iff \text{the conjunction } \bigwedge \Gamma \Rightarrow \phi \text{ is valid}.\]

The first equivalence is the basis of refutation-based reasoning: prove \(\phi\) by showing that adding \(\lnot \phi\) to the premises makes the whole set unsatisfiable. Resolution and DPLL exploit this directly.

Proof Systems and the Soundness/Completeness Pair

A proof system specifies syntactic rules for deriving sentences from other sentences. Write \(\Gamma \vdash \phi\) for “there is a derivation of \(\phi\) from \(\Gamma\) in the system.”

Two properties relate \(\vdash\) to \(\models\):

Soundness. Every derivation produces a genuine consequence: \(\Gamma \vdash \phi \Rightarrow \Gamma \models \phi\).
Completeness. Every genuine consequence has a derivation: \(\Gamma \models \phi \Rightarrow \Gamma \vdash \phi\).

Together they say the proof system is exactly the right tool for tracking entailment in that logic. Sound but incomplete systems miss conclusions; complete but unsound systems hallucinate them. Most useful proof systems aim for both.

The proof procedures from earlier in the course are evaluated on this scale:

Forward chaining is sound and complete for definite Horn clauses (proof).
Resolution is sound and refutation-complete for first-order logic (proof).
DPLL and modern SAT solvers are sound and complete for propositional logic.

Decidability and the Limits of Mechanical Reasoning

A logic is decidable if there is an algorithm that, for every \(\Gamma\) and \(\phi\), halts and returns a correct yes/no answer to “\(\Gamma \models \phi\)?” Decidability is a property of the logic, not of a particular proof system.

Logic	Entailment is…	Notes
Propositional logic	Decidable	NP-complete (validity of \(\bigwedge \Gamma \Rightarrow \phi\)); co-NP-complete for entailment
Description logics	Decidable	Complexity ranges PTime to NExpTime by dialect
First-order logic	Semi-decidable	If \(\Gamma \models \phi\), a procedure can confirm in finite time; if not, it may run forever
Second-order logic, arithmetic	Undecidable	No mechanical procedure suffices

Semi-decidability is what resolution achieves for first-order logic. Validity in first-order logic is recursively enumerable but not recursive: you can enumerate all proofs, but you cannot in general decide whether a proof exists.

The boundary at first-order logic is fundamental. It is why richer-than-FOL systems (Peano arithmetic, set theory) admit no complete proof procedure — Gödel’s incompleteness theorems sit just past this point.

Why Entailment Is the Right Notion for KR

A knowledge base is intended to model a domain; what counts as a consequence of the KB should reflect the domain, not the syntactic conveniences of how it was written. Entailment captures exactly this — two KBs with identical models entail identical sentences, regardless of the syntactic form they take.

This abstraction is what lets knowledge graphs, ontologies, and rule-based systems coexist: each is a different syntactic vehicle for declaring a set of models, and all are queried against the same semantic standard. When a neurosymbolic system uses a symbolic verifier to check a neural proposal, the property the verifier is checking is entailment by the formal theory it implements — the network supplies candidates, the verifier supplies semantic correctness.

Beyond Classical Entailment

Several variants relax the classical setting for cases where it is too rigid:

Closed-world reasoning. Treat the absence of \(\phi\) as evidence for \(\lnot \phi\) (the convention in databases). Formalized via circumscription or stable-model semantics.
Default reasoning. Allow defeasible inferences (“birds typically fly”) that are withdrawn when contradicting facts arrive.
Probabilistic entailment. Generalize \(\models\) from \(\{0,1\}\) to probabilities — Bayesian networks and probabilistic logic programs follow this path.
Modal and temporal logics. Add operators for necessity, possibility, “always,” “eventually.” Entailment is defined over Kripke-style structures that include accessibility relations.

Each variant is interesting precisely because classical entailment is so strict: the variants live or die by how clearly their semantic definitions explain the inferences they license.

References

Enderton, Herbert B. 2001. A Mathematical Introduction to Logic. 2nd ed. Academic Press. https://www.sciencedirect.com/book/9780122384523/a-mathematical-introduction-to-logic.