LQR Is Solved by the Backward Riccati Recursion
Claim
Consider the finite-horizon LQR problem with linear dynamics \(x_{t+1} = Ax_t + Bu_t\) and cost
\[ J = x_T^\top Q_f\, x_T + \sum_{t=0}^{T-1} \bigl( x_t^\top Q\, x_t + u_t^\top R\, u_t \bigr), \]
where \(Q, Q_f \succeq 0\) and \(R \succ 0\). The optimal cost-to-go satisfies \(V_t(x) = x^\top P_t x\), where \(P_t \succeq 0\) satisfies the backward Riccati recursion:
\[ P_T = Q_f, \qquad P_t = Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A, \]
and the optimal control is \(u_t^* = -K_t x_t\) with \(K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A\).
Proof
We proceed by backward induction on \(t\).
Base case (\(t = T\)). The terminal cost is \(V_T(x) = x^\top Q_f x = x^\top P_T x\) with \(P_T = Q_f \succeq 0\).
Inductive step. Suppose \(V_{t+1}(x) = x^\top P_{t+1} x\) with \(P_{t+1} \succeq 0\). The Bellman equation gives
\[ V_t(x) = \min_u \bigl[ x^\top Q\, x + u^\top R\, u + (Ax + Bu)^\top P_{t+1} (Ax + Bu) \bigr]. \]
Expanding the last term:
\[ V_t(x) = \min_u \Bigl[ x^\top Q x + u^\top R u + x^\top A^\top P_{t+1} A x + 2x^\top A^\top P_{t+1} B u + u^\top B^\top P_{t+1} B u \Bigr]. \]
The objective is strictly convex in \(u\) because \(R + B^\top P_{t+1} B \succ 0\) (since \(R \succ 0\)). Setting the gradient with respect to \(u\) to zero:
\[ 2Ru + 2B^\top P_{t+1} B u + 2B^\top P_{t+1} A x = 0 \implies u^* = -\underbrace{(R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A}_{K_t} x. \]
Substituting \(u^* = -K_t x\) into the objective and writing \(M_t = A - BK_t\):
\[ V_t(x) = x^\top \bigl[ Q + K_t^\top R K_t + M_t^\top P_{t+1} M_t \bigr] x. \]
Simplification to Riccati form. Expand \(M_t^\top P_{t+1} M_t\) and \(K_t^\top R K_t\) using \(K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A\). Let \(S_t = R + B^\top P_{t+1} B\) for brevity. Then
\[ K_t^\top R K_t + M_t^\top P_{t+1} M_t = A^\top P_{t+1} B S_t^{-1} R S_t^{-\top} B^\top P_{t+1} A + (A - BK_t)^\top P_{t+1} (A - BK_t). \]
Collecting all terms involving \(B^\top P_{t+1} A\) and using \(S_t^{-1} R + S_t^{-1} B^\top P_{t+1} B = I\) (since \(S_t = R + B^\top P_{t+1} B\)), the cross terms telescope and the expression simplifies to
\[ K_t^\top R K_t + M_t^\top P_{t+1} M_t = A^\top P_{t+1} A - A^\top P_{t+1} B S_t^{-1} B^\top P_{t+1} A. \]
Therefore
\[ P_t = Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A, \]
which is the stated Riccati recursion. Since \(P_{t+1} \succeq 0\) and \(Q \succeq 0\), the Schur complement structure implies \(P_t \succeq 0\), preserving the inductive hypothesis.
By induction, \(V_t(x) = x^\top P_t x\) for all \(t \in \{0, \ldots, T\}\) and the optimal control is \(u_t^* = -K_t x_t\).
Remarks
- The simplification step is equivalent to completing the square in \(u\), or applying the matrix inversion lemma.
- For the infinite-horizon problem, iterating the recursion from any \(P_0 \succeq 0\) converges to the unique positive semidefinite solution of the algebraic Riccati equation under stabilizability and detectability (Anderson and Moore 1990).