SAT is NP-Complete

Statement

The Boolean satisfiability problem is NP-complete: it is in NP, and every problem in NP reduces to it in polynomial time (Cook 1971).

NP Membership

Given a formula \(\phi\) and a proposed assignment \(\alpha\), verify by evaluating \(\phi\) under \(\alpha\) in time \(O(|\phi|)\). If the result is true, \(\alpha\) is a witness. So SAT \(\in\) NP.

NP-Hardness

Let \(L \in \text{NP}\). Then there exists a non-deterministic Turing machine \(M\) and a polynomial \(p\) such that for every input \(x\) of length \(n\):

\[x \in L \iff M \text{ accepts } x \text{ in at most } p(n) \text{ steps.}\]

We construct a Boolean formula \(\varphi_x\) that is satisfiable if and only if \(M\) accepts \(x\).

Variables

A computation of \(M\) on \(x\) lasting \(p(n)\) steps uses at most \(p(n)\) tape cells. Let \(Q\), \(\Gamma\), and \(\{0,\ldots,p(n)\}\) denote the state set, tape alphabet, and time steps. Introduce variables:

  • \(\text{state}[t][q]\): machine is in state \(q\) at time \(t\).
  • \(\text{head}[t][i]\): head is at cell \(i\) at time \(t\).
  • \(\text{cell}[t][i][a]\): cell \(i\) contains symbol \(a\) at time \(t\).

There are \(O(p(n)^2)\) variables in total (since \(|Q|\) and \(|\Gamma|\) are constants depending only on \(M\)).

Clauses

Add polynomial-size groups of clauses asserting:

Consistency. At each time \(t\): exactly one state holds, the head occupies exactly one cell, and each cell holds exactly one symbol. (Encode “exactly one” as one clause asserting at least one and \(\binom{k}{2}\) clauses asserting at most one, for constant \(k\).)

Initial configuration. At time \(0\): the machine is in its start state, the head is at cell \(0\), and the tape contains \(x\) followed by blanks.

Valid transitions. For each time \(t < p(n)\) and each triple \((q, i, a)\) of state, head position, and tape symbol, add clauses stating: if \(\text{state}[t][q] \land \text{head}[t][i] \land \text{cell}[t][i][a]\) holds, then the configuration at time \(t+1\) is consistent with some transition of \(M\) from \((q, a)\). Cells not under the head are unchanged. Since transitions are local (they depend only on the current state and the symbol under the head), each time step contributes \(O(p(n))\) clauses.

Acceptance. At least one time \(t \leq p(n)\) has the machine in an accepting state: \[\bigvee_{t=0}^{p(n)} \text{state}[t][q_{\text{accept}}].\]

Correctness

  • Completeness. If \(M\) accepts \(x\) via some non-deterministic computation, set the variables to encode that computation. All clause groups are satisfied by construction.
  • Soundness. Any satisfying assignment encodes a sequence of configurations consistent with \(M\)’s transitions and starting from the correct initial configuration, ending in an accepting state. This is a valid accepting computation of \(M\) on \(x\).

Size and Constructibility

The formula has \(O(p(n)^2)\) variables and \(O(p(n)^2)\) clauses, so its size is polynomial in \(n\). It can be constructed in polynomial time given \(x\) and the description of \(M\).

Conclusion

Since \(L\) was an arbitrary NP problem, every NP problem reduces to SAT in polynomial time. Combined with SAT \(\in\) NP, this establishes that SAT is NP-complete.

References

Cook, Stephen A. 1971. “The Complexity of Theorem-Proving Procedures.” Proceedings of the Third Annual ACM Symposium on Theory of Computing - STOC ’71, STOC ’71, 151–58. https://doi.org/10.1145/800157.805047.