|
| 1 | +--- |
| 2 | +jupytext: |
| 3 | + text_representation: |
| 4 | + extension: .md |
| 5 | + format_name: myst |
| 6 | + format_version: 0.13 |
| 7 | + jupytext_version: 1.11.1 |
| 8 | +kernelspec: |
| 9 | + display_name: Python 3 |
| 10 | + language: python |
| 11 | + name: python3 |
| 12 | +--- |
| 13 | + |
| 14 | +# Eliminating Cross Products |
| 15 | + |
| 16 | +## Overview |
| 17 | + |
| 18 | +This lecture describes formulas for eliminating |
| 19 | + |
| 20 | + * cross products between states and control in linear-quadratic dynamic programming problems |
| 21 | + |
| 22 | + * covariances between state and measurement noises in Kalman filtering problems |
| 23 | + |
| 24 | + |
| 25 | +For a linear-quadratic dynamic programming problem, the idea involves these steps |
| 26 | + |
| 27 | + * transform states and controls in a way that leads to an equivalent problem with no cross-products between transformed states and controls |
| 28 | + * solve the transformed problem using standard formulas for problems with no cross-products between states and controls presented in this lecture {doc}`Linear Control: Foundations <lqcontrol>` |
| 29 | + * transform the optimal decision rule for the altered problem into the optimal decision rule for the original problem with cross-products between states and controls |
| 30 | + |
| 31 | ++++ |
| 32 | + |
| 33 | +## Undiscounted dynamic programming problem |
| 34 | + |
| 35 | +Here is a nonstochastic undiscounted LQ dynamic programming problem with cross products between |
| 36 | +states and controls in the objective function. |
| 37 | + |
| 38 | + |
| 39 | + |
| 40 | +The problem is defined by the 5-tuple of matrices $(A, B, R, Q, H)$ |
| 41 | +where $R$ and $Q$ are positive definite symmetric matrices and |
| 42 | +$A \sim m \times m, B \sim m \times k, Q \sim k \times k, R \sim m \times m$ and $H \sim k \times m$. |
| 43 | + |
| 44 | + |
| 45 | +The problem is to choose $\{x_{t+1}, u_t\}_{t=0}^\infty$ to maximize |
| 46 | + |
| 47 | +$$ |
| 48 | + - \sum_{t=0}^\infty (x_t' R x_t + u_t' Q u_t + 2 u_t' H x_t) |
| 49 | +$$ |
| 50 | + |
| 51 | +subject to the linear constraints |
| 52 | + |
| 53 | +$$ x_{t+1} = A x_t + B u_t, \quad t \geq 0 $$ |
| 54 | + |
| 55 | +where $x_0$ is a given initial condition. |
| 56 | + |
| 57 | +The solution to this undiscounted infinite-horizon problem is a time-invariant feedback rule |
| 58 | + |
| 59 | +$$ u_t = -F x_t $$ |
| 60 | + |
| 61 | +where |
| 62 | + |
| 63 | +$$ F = (Q + B'PB)^{-1} (B'PA + H) $$ |
| 64 | + |
| 65 | +and $P \sim m \times m $ is a positive definite solution of the algebraic matrix Riccati equation |
| 66 | + |
| 67 | +$$ |
| 68 | +P = R + A'PA - (A'PB + H')(Q + B'PB)^{-1}(B'PA + H). |
| 69 | +$$ |
| 70 | + |
| 71 | + |
| 72 | ++++ |
| 73 | + |
| 74 | +It can be verified that an **equivalent** problem without cross-products between states and controls |
| 75 | +is defined by a 4-tuple of matrices $(A^*, B, R^*, Q)$. |
| 76 | + |
| 77 | +Omitting $H$ from this tuple signals that $H = 0$, so there are no cross products between states and controls |
| 78 | +in the equivalent problem. |
| 79 | + |
| 80 | +The matrices $(A^*, B, R^*, Q)$ that define the equivalent problem, together with the value function and policy function matrices $P, F^*$ that solve it, are related to the matrices $(A, B, R, Q, H)$ that define the original problem, and to the matrices $P, F$ that solve it, by |
| 81 | + |
| 82 | +\begin{align*} |
| 83 | +A^* & = A - B Q^{-1} H, \\ |
| 84 | +R^* & = R - H'Q^{-1} H, \\ |
| 85 | +P & = R^* + {A^*}' P A^* - ({A^*}' P B) (Q + B' P B)^{-1} B' P A^*, \\ |
| 86 | +F^* & = (Q + B' P B)^{-1} B' P A^*, \\ |
| 87 | +F & = F^* + Q^{-1} H. |
| 88 | +\end{align*} |
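
The relations above can be checked numerically. The following is a minimal sketch, not the lecture's own code: it assumes only NumPy, uses a hypothetical helper `solve_lq` that iterates on the Riccati equation until it converges, and picks small illustrative matrices (not taken from the lecture) for which $R - H'Q^{-1}H$ is positive definite.

```python
import numpy as np


def solve_lq(A, B, R, Q, H=None, tol=1e-10, max_iter=10_000):
    """Hypothetical helper: iterate on
    P = R + A'PA - (A'PB + H')(Q + B'PB)^{-1}(B'PA + H)
    and return (P, F), where the decision rule is u_t = -F x_t."""
    m, k = B.shape
    H = np.zeros((k, m)) if H is None else H
    P = np.eye(m)
    for _ in range(max_iter):
        G = B.T @ P @ A + H                       # k x m
        F = np.linalg.solve(Q + B.T @ P @ B, G)   # (Q + B'PB)^{-1} (B'PA + H)
        P_new = R + A.T @ P @ A - G.T @ F
        P_new = 0.5 * (P_new + P_new.T)           # keep P symmetric
        converged = np.max(np.abs(P_new - P)) < tol
        P = P_new
        if converged:
            break
    else:
        raise RuntimeError("Riccati iteration did not converge")
    F = np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A + H)
    return P, F


# Illustrative matrices (assumptions made for this example only)
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
R = np.eye(2)
Q = np.array([[1.0]])
H = np.array([[0.1, 0.2]])

# Original problem, with cross products
P, F = solve_lq(A, B, R, Q, H)

# Transformed problem, without cross products
Q_inv_H = np.linalg.solve(Q, H)      # Q^{-1} H
A_star = A - B @ Q_inv_H
R_star = R - H.T @ Q_inv_H
P_star, F_star = solve_lq(A_star, B, R_star, Q)

print("P matrices agree: ", np.allclose(P, P_star, atol=1e-6))
print("F = F* + Q^{-1} H:", np.allclose(F, F_star + Q_inv_H, atol=1e-6))
```

Plain fixed-point iteration is used here only to keep the sketch self-contained; a dedicated Riccati solver could be substituted for it.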
| 89 | + |
| 90 | ++++ |
| 91 | + |
| 92 | +## Kalman filter |
| 93 | + |
| 94 | +The **duality** that prevails between a linear-quadratic optimal control and a Kalman filtering problem means that there is an analogous transformation that allows us to transform a Kalman filtering problem |
| 95 | +with non-zero covariance matrix between shocks to states and shocks to measurements to an equivalent Kalman filtering problem with zero covariance between shocks to states and measurements. |
| 96 | + |
| 97 | +Let's look at the appropriate transformations. |
| 98 | + |
| 99 | + |
| 100 | +First, let's recall the Kalman filter with covariance between noises to states and measurements. |
| 101 | + |
| 102 | +The hidden Markov model is |
| 103 | + |
| 104 | +\begin{align*} |
| 105 | +x_{t+1} & = A x_t + B w_{t+1}, \\ |
| 106 | +z_{t+1} & = D x_t + F w_{t+1}, |
| 107 | +\end{align*} |
| 108 | + |
| 109 | +where $A \sim m \times m, B \sim m \times p $ and $D \sim k \times m, F \sim k \times p $, |
| 110 | +and $w_{t+1}$ is the time $t+1$ component of a sequence of i.i.d. $p \times 1$ normally distributed |
| 111 | +random vectors with mean vector zero and covariance matrix equal to a $p \times p$ identity matrix. |
| 112 | + |
| 113 | +Thus, $x_t$ is $m \times 1$ and $z_t$ is $k \times 1$. |
| 114 | + |
| 115 | +The Kalman filtering formulas are |
| 116 | + |
| 117 | + |
| 118 | +\begin{align*} |
| 119 | +K(\Sigma_t) & = (A \Sigma_t D' + BF')(D \Sigma_t D' + FF')^{-1}, \\ |
| 120 | +\Sigma_{t+1}& = A \Sigma_t A' + BB' - (A \Sigma_t D' + BF')(D \Sigma_t D' + FF')^{-1} (D \Sigma_t A' + FB'). |
| 121 | +\end{align*} (eq:Kalman102) |
| 122 | + |
| 123 | + |
| 124 | +Define transformed matrices |
| 125 | + |
| 126 | +\begin{align*} |
| 127 | +A^* & = A - BF' (FF')^{-1} D, \\ |
| 128 | +B^* {B^*}' & = BB' - BF' (FF')^{-1} FB'. |
| 129 | +\end{align*} |
| 130 | + |
| 131 | +### Algorithm |
| 132 | + |
| 133 | +A consequence of formulas {eq}`eq:Kalman102` is that we can use the following algorithm to solve Kalman filtering problems that involve nonzero covariances between state and signal noises. |
| 134 | + |
| 135 | +First, compute $\Sigma, K^*$ using the ordinary Kalman filtering formula with $BF' = 0$, i.e., |
| 136 | +with zero covariance matrix between random shocks to states and random shocks to measurements. |
| 137 | + |
| 138 | +That is, compute $K^*$ and $\Sigma$ that satisfy |
| 139 | + |
| 140 | +\begin{align*} |
| 141 | +K^* & = (A^* \Sigma D')(D \Sigma D' + FF')^{-1}, \\ |
| 142 | +\Sigma & = A^* \Sigma {A^*}' + B^* {B^*}' - (A^* \Sigma D')(D \Sigma D' + FF')^{-1} (D \Sigma {A^*}'). |
| 143 | +\end{align*} |
| 144 | + |
| 145 | +The Kalman gain for the original problem **with non-zero covariance** between shocks to states and measurements is then |
| 146 | + |
| 147 | +$$ |
| 148 | +K = K^* + BF' (FF')^{-1}. |
| 149 | +$$ |
| 150 | + |
| 151 | +The state reconstruction covariance matrix $\Sigma$ for the original problem equals the state reconstruction covariance matrix for the transformed problem. |
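
The algorithm can be sketched in a few lines of NumPy. This is an illustrative check under stated assumptions, not the lecture's implementation: `stationary_kalman` is a hypothetical helper that iterates the covariance recursion until it converges, the names `V_x`, `V_z`, and `C` stand for $BB'$, $FF'$, and $BF'$, and the matrices are made-up examples with a stable $A$ and an invertible $FF'$.

```python
import numpy as np


def stationary_kalman(A, D, V_x, V_z, C=None, tol=1e-10, max_iter=10_000):
    """Hypothetical helper: iterate on
    Sigma' = A Sigma A' + V_x - (A Sigma D' + C)(D Sigma D' + V_z)^{-1}(D Sigma A' + C')
    with V_x = BB', V_z = FF', C = BF', and return (Sigma, K)."""
    m, k = A.shape[0], D.shape[0]
    C = np.zeros((m, k)) if C is None else C

    def gain(Sigma):
        G = A @ Sigma @ D.T + C                 # m x k
        S = D @ Sigma @ D.T + V_z               # k x k, symmetric
        return G, np.linalg.solve(S, G.T).T     # K = G S^{-1}

    Sigma = np.eye(m)
    for _ in range(max_iter):
        G, K = gain(Sigma)
        Sigma_new = A @ Sigma @ A.T + V_x - K @ G.T
        Sigma_new = 0.5 * (Sigma_new + Sigma_new.T)   # keep Sigma symmetric
        converged = np.max(np.abs(Sigma_new - Sigma)) < tol
        Sigma = Sigma_new
        if converged:
            break
    else:
        raise RuntimeError("covariance iteration did not converge")
    return Sigma, gain(Sigma)[1]


# Illustrative matrices (assumptions made for this example only)
A = np.array([[0.9, 0.1],
              [0.0, 0.7]])
B = np.array([[1.0, 0.0],
              [0.5, 1.0]])
D = np.array([[1.0, 0.0]])
F = np.array([[0.5, 1.0]])

V_x, V_z, C = B @ B.T, F @ F.T, B @ F.T   # BB', FF', BF'

# Original problem, with correlated state and measurement noises
Sigma, K = stationary_kalman(A, D, V_x, V_z, C)

# Transformed problem, with zero covariance between the noises
C_Vz_inv = np.linalg.solve(V_z, C.T).T    # BF' (FF')^{-1}
A_star = A - C_Vz_inv @ D
V_x_star = V_x - C_Vz_inv @ C.T           # B* B*'
Sigma_star, K_star = stationary_kalman(A_star, D, V_x_star, V_z)

print("Sigma matrices agree:  ", np.allclose(Sigma, Sigma_star, atol=1e-6))
print("K = K* + BF'(FF')^{-1}:", np.allclose(K, K_star + C_Vz_inv, atol=1e-6))
```

Passing the covariance matrices $BB'$, $FF'$, and $BF'$ instead of $B$ and $F$ themselves is a design choice of this sketch: the transformed problem only pins down the product $B^* {B^*}'$, so no factor $B^*$ needs to be computed.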
| 152 | + |
| 153 | ++++ |
| 154 | + |
| 155 | +## Duality table |
| 156 | + |
| 157 | +Here is a handy table to remember how the Kalman filter and dynamic program are related. |
| 158 | + |
| 159 | + |
| 160 | +| Dynamic Program | Kalman Filter | |
| 161 | +| :-------------: | :-----------: | |
| 162 | +| $A$ | $A'$ | |
| 163 | +| $B$ | $D'$ | |
| 164 | +| $H$ | $FB'$ | |
| 165 | +| $Q$ | $FF'$ | |
| 166 | +| $R$ | $BB'$ | |
| 167 | +| $F$ | $K'$ | |
| 168 | +| $P$ | $\Sigma$ | |
| 169 | + |
| 170 | ++++ |
| 171 | + |
| 172 | + |
| 173 | +```{code-cell} ipython3 |
| 174 | +
|
| 175 | +``` |