LQR Is Solved by the Backward Riccati Recursion

Claim

Consider the finite-horizon LQR problem with linear dynamics \(x_{t+1} = Ax_t + Bu_t\) and cost

\[ J = x_T^\top Q_f\, x_T + \sum_{t=0}^{T-1} \bigl( x_t^\top Q\, x_t + u_t^\top R\, u_t \bigr), \]

where \(Q, Q_f \succeq 0\) and \(R \succ 0\). The optimal cost-to-go satisfies \(V_t(x) = x^\top P_t x\), where \(P_t \succeq 0\) satisfies the backward Riccati recursion:

\[ P_T = Q_f, \qquad P_t = Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A, \]

and the optimal control is \(u_t^* = -K_t x_t\) with \(K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A\).

Proof

We proceed by backward induction on \(t\).

Base case (\(t = T\)). The terminal cost is \(V_T(x) = x^\top Q_f x = x^\top P_T x\) with \(P_T = Q_f \succeq 0\).

Inductive step. Suppose \(V_{t+1}(x) = x^\top P_{t+1} x\) with \(P_{t+1} \succeq 0\). The Bellman equation gives

\[ V_t(x) = \min_u \bigl[ x^\top Q\, x + u^\top R\, u + (Ax + Bu)^\top P_{t+1} (Ax + Bu) \bigr]. \]

Expanding the last term:

\[ V_t(x) = \min_u \Bigl[ x^\top Q x + u^\top R u + x^\top A^\top P_{t+1} A x + 2x^\top A^\top P_{t+1} B u + u^\top B^\top P_{t+1} B u \Bigr]. \]

The objective is strictly convex in \(u\) because \(R + B^\top P_{t+1} B \succ 0\) (since \(R \succ 0\)). Setting the gradient with respect to \(u\) to zero:

\[ 2Ru + 2B^\top P_{t+1} B u + 2B^\top P_{t+1} A x = 0 \implies u^* = -\underbrace{(R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A}_{K_t} x. \]

Substituting \(u^* = -K_t x\) into the objective and writing \(M_t = A - BK_t\):

\[ V_t(x) = x^\top \bigl[ Q + K_t^\top R K_t + M_t^\top P_{t+1} M_t \bigr] x. \]

Simplification to Riccati form. Expand \(M_t^\top P_{t+1} M_t\) and \(K_t^\top R K_t\) using \(K_t = (R + B^\top P_{t+1} B)^{-1} B^\top P_{t+1} A\). Let \(S_t = R + B^\top P_{t+1} B\) for brevity. Then

\[ K_t^\top R K_t + M_t^\top P_{t+1} M_t = A^\top P_{t+1} B S_t^{-1} R S_t^{-\top} B^\top P_{t+1} A + (A - BK_t)^\top P_{t+1} (A - BK_t). \]

Collecting all terms involving \(B^\top P_{t+1} A\) and using \(S_t^{-1} R + S_t^{-1} B^\top P_{t+1} B = I\) (since \(S_t = R + B^\top P_{t+1} B\)), the cross terms telescope and the expression simplifies to

\[ K_t^\top R K_t + M_t^\top P_{t+1} M_t = A^\top P_{t+1} A - A^\top P_{t+1} B S_t^{-1} B^\top P_{t+1} A. \]

Therefore

\[ P_t = Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A, \]

which is the stated Riccati recursion. Since \(P_{t+1} \succeq 0\) and \(Q \succeq 0\), the Schur complement structure implies \(P_t \succeq 0\), preserving the inductive hypothesis.

By induction, \(V_t(x) = x^\top P_t x\) for all \(t \in \{0, \ldots, T\}\) and the optimal control is \(u_t^* = -K_t x_t\).

Remarks

  • The simplification step is equivalent to completing the square in \(u\), or applying the matrix inversion lemma.
  • For the infinite-horizon problem, iterating the recursion from any \(P_0 \succeq 0\) converges to the unique positive semidefinite solution of the algebraic Riccati equation under stabilizability and detectability (Anderson and Moore 1990).

References

Anderson, Brian D. O., and John B. Moore. 1990. Optimal Control: Linear Quadratic Methods. Prentice Hall.