Commit c442e66

Browse files
Tom's June 11 edits of DMD lecture
1 parent 0d1df55 commit c442e66

1 file changed: lectures/svd_intro.md (77 additions & 17 deletions)
@@ -842,7 +842,7 @@ Thus, our estimator $\hat A = X' X^+$ of the $m \times m$ matrix of coefficient
 
 $$
 \hat A = X' V \Sigma^{-1} U^T
-$$
+$$ (eq:AhatSVDformula)
 
 In addition to doing that, we'll eventually use **dynamic mode decomposition** to compute a rank $ r $ approximation to $ \hat A $,
 where $ r < p $.
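
As a sanity check on the newly labeled equation {eq}`eq:AhatSVDformula`, here is a minimal numpy sketch (an editorial illustration, not part of the lecture): the transition matrix `A`, the sizes `m, n`, and all variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 20                       # assumed sizes for illustration, m < n
A = rng.standard_normal((m, m))    # hypothetical transition matrix

# Random snapshots; setting X' = A X makes each pair satisfy X'_t = A X_t
X = rng.standard_normal((m, n))
Xprime = A @ X

# Reduced SVD of X:  X = U Sigma V^T
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

# A-hat = X' V Sigma^{-1} U^T, equation (eq:AhatSVDformula)
A_hat = Xprime @ V @ np.diag(1.0 / sigma) @ U.T

assert np.allclose(A_hat, Xprime @ np.linalg.pinv(X))  # same as X' X^+
assert np.allclose(A_hat, A)   # X has full row rank m, so A-hat recovers A
```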
@@ -977,7 +977,7 @@ $$
 \hat b_{t+1} = \Lambda \hat b_t
 $$
 
-where now our endoder is
+where now our encoder is
 
 $$
 \hat b_t = W^{-1} U^T X_t
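
Continuing the sketch above, the encoder in the corrected line can be checked numerically; `A_tilde`, `Lam`, `W` stand in for the lecture's $\tilde A$, $\Lambda$, $W$, and the decoding check exploits the fact that $p = m$ in this small example.

```python
# Reduced matrix A-tilde = U^T A-hat U, its eigendecomposition, and the
# encoder b-hat_t = W^{-1} U^T X_t applied to all columns of X at once.
A_tilde = U.T @ Xprime @ V @ np.diag(1.0 / sigma)   # = U^T A-hat U here
Lam, W = np.linalg.eig(A_tilde)                     # A-tilde W = W diag(Lam)
b_hat = np.linalg.solve(W, U.T @ X)                 # column t is b-hat_t

# Decoding: U W b-hat_t recovers X_t exactly here because p = m
assert np.allclose(U @ W @ b_hat, X)
```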
@@ -1158,14 +1158,71 @@ $$
 where
 
 $$
-\begin{aligned}
-\check b_t & = \Phi^+ X_t \cr
-X_t & = \Phi \check b_t
-\end{aligned}
+\check b_t = \Phi^+ X_t
+$$ (eq:decoder102)
+
+Here $\check b_t$ is a $p \times 1$ vector of regression coefficients, namely, column $t$ of the $p \times n$ matrix of regression coefficients
+
 $$
+\check b = \Phi^{+} X .
+$$ (eq:decoder103)
+
+Furthermore, $\check X_t$ is the $m \times 1$ vector of decoded or projected values of $X_t$ corresponding
+to column $t$ of the $m \times n$ matrix $X$.
+
+Since $\Phi$ has $p$ linearly independent columns, the generalized inverse of $\Phi$ is
 
+$$
+\Phi^{+} = (\Phi^T \Phi)^{-1} \Phi^T
+$$
 
-But there is a better way to compute the $p \times 1$ vector $\check b_t$
+and so
+
+$$
+\check b = (\Phi^T \Phi)^{-1} \Phi^T X .
+$$ (eq:checkbform)
+
+Here $\check b$ can be recognized as a matrix of least squares regression coefficients of the matrix
+$X$ on the matrix $\Phi$, and $\Phi \check b$ is the least squares projection of $X$ on $\Phi$.
+
+In more detail, by virtue of the least-squares projection theory discussed in <https://python-advanced.quantecon.org/orth_proj.html>,
+we can represent $X$ as the sum of the projection $\check X$ of $X$ on $\Phi$ and a matrix of errors orthogonal to $\Phi$.
+
+The projection of $X_t$ is
+
+$$
+\check X_t = \Phi \check b_t
+$$
+
+and the least squares projection $\check X$ is related to $X$ by
+
+$$
+X = \Phi \check b + \epsilon ,
+$$
+
+where $\epsilon$ is an $m \times n$ matrix of least squares errors satisfying the least squares
+orthogonality conditions $\epsilon^T \Phi = 0$ or
+
+$$
+(X - \Phi \check b)^T \Phi = 0_{n \times p} .
+$$ (eq:orthls)
+
+Rearranging the orthogonality conditions {eq}`eq:orthls` gives $X^T \Phi = \check b^T \Phi^T \Phi$;
+transposing both sides gives $\Phi^T X = \Phi^T \Phi \check b$, which implies formula {eq}`eq:checkbform`.
+
+### Alternative algorithm
+
+There is a better way to compute the $p \times 1$ vector $\check b_t$ than provided by formula
+{eq}`eq:decoder102`.
 
 In particular, the following argument from {cite}`DDSE_book` (page 240) provides a computationally efficient way
 to compute $\check b_t$.
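
The least squares claims in this new passage can be verified with the same hypothetical arrays; note that numpy's eigenvectors are complex in general, so the transposes in {eq}`eq:checkbform` and {eq}`eq:orthls` are implemented below as conjugate transposes, and with $p = m$ in this example $\Phi$ is square, so the projection is exact.

```python
# DMD modes Phi = X' V Sigma^{-1} W (eq:Phiformula) and the regression
# coefficients b-check = Phi^+ X (eq:decoder102).
Phi = Xprime @ V @ np.diag(1.0 / sigma) @ W

# Normal-equations form (eq:checkbform) ...
b_check = np.linalg.solve(Phi.conj().T @ Phi, Phi.conj().T @ X)
# ... agrees with the pseudoinverse form (eq:decoder102)
assert np.allclose(b_check, np.linalg.pinv(Phi) @ X)

# Least squares orthogonality (eq:orthls); here Phi is square, so the
# error matrix is (numerically) zero and the conditions hold trivially
eps = X - Phi @ b_check
assert np.allclose(eps.conj().T @ Phi, 0)
```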
@@ -1184,7 +1241,7 @@ where $\check b_1$ is an $r \times 1$ vector.
 
 Recall from representation 1 above that $X_1 = U \tilde b_1$, where $\tilde b_1$ is the time $1$ basis vector for representation 1.
 
-It then follows that
+It then follows from equation {eq}`eq:Phiformula` that
 
 $$
 U \tilde b_1 = X' V \Sigma^{-1} W \check b_1
@@ -1196,7 +1253,9 @@ $$
 \tilde b_1 = U^T X' V \Sigma^{-1} W \check b_1
 $$
 
-Since $ \tilde A = U^T X' V \Sigma^{-1}$, it follows that
+Recall from equation {eq}`eq:AhatSVDformula` that $ \tilde A = U^T X' V \Sigma^{-1}$.
+
+It then follows that
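
For readers tracing the "Recall" step, the algebra is one line, combining the definition $\tilde A = U^T \hat A U$ with {eq}`eq:AhatSVDformula` and $U^T U = I$:

$$
\tilde A = U^T \hat A U = U^T \left( X' V \Sigma^{-1} U^T \right) U = U^T X' V \Sigma^{-1} .
$$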
@@ -1208,7 +1267,7 @@ $$
 \tilde b_1 = W \Lambda \check b_1
 $$
 
-Consesquently,
+Consequently,
 
 $$
 \check b_1 = ( W \Lambda)^{-1} \tilde b_1
@@ -1218,34 +1277,35 @@ or
 
 
 $$
-\check b_1 = ( W \Lambda)^{-1} U^T X_1
+\check b_1 = ( W \Lambda)^{-1} U^T X_1 ,
 $$ (eq:beqnsmall)
 
 
-which is computationally more efficient than the following instance of our earlier equation for computing the initial vector $\check b_1$:
+which is computationally more efficient than the following instance of equation {eq}`eq:decoder102` for computing the initial vector $\check b_1$:
 
 $$
 \check b_1= \Phi^{+} X_1
 $$ (eq:bphieqn)
 
 
-Components of the basis vector $\check b_t = \Phi^+ X_t \equiv (W \Lambda)^{-1} U^T X_t$ are often called **exact** DMD nodes.
+The literature on DMD sometimes labels components of the basis vector $\check b_t = \Phi^+ X_t \equiv (W \Lambda)^{-1} U^T X_t$ as **exact** DMD modes.
 
-Conditional on $X_t$, we can construct forecasts $\overline X_{t+j} $ of $X_{t+j}, j = 1, 2, \ldots, $ from
+Conditional on $X_t$, we can compute decoded values $\check X_{t+j}, \ j = 1, 2, \ldots, $ from
 either
 
 $$
-\overline X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
+\check X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
 $$ (eq:checkXevoln)
 
 or
 
 $$
-\overline X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} U^T X_t
+\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} U^T X_t .
 $$ (eq:checkXevoln2)
 
+We can then use $\check X_{t+j}$ to forecast $X_{t+j}$.
 
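
A quick numerical check of {eq}`eq:beqnsmall` against {eq}`eq:bphieqn`, and of the decoded path {eq}`eq:checkXevoln2`, again continuing the hypothetical arrays above:

```python
# Efficient amplitude formula vs pseudoinverse formula.
# Note W * Lam multiplies column i of W by Lam[i], i.e. W @ np.diag(Lam).
X1 = X[:, 0]
b1_direct = np.linalg.pinv(Phi) @ X1                # Phi^+ X_1
b1_fast = np.linalg.solve(W * Lam, U.T @ X1)        # (W Lam)^{-1} U^T X_1
assert np.allclose(b1_direct, b1_fast)

# Decode j steps ahead: X-check_{1+j} = Phi Lam^j (W Lam)^{-1} U^T X_1
j = 5
X_check = Phi @ (Lam**j * b1_fast)                  # Phi diag(Lam^j) b1
# With X' = A X exactly, the decoded value reproduces A^j X_1
assert np.allclose(X_check, np.linalg.matrix_power(A, j) @ X1)
```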
@@ -1254,7 +1314,7 @@ $$ (eq:checkXevoln2)
 Some of the preceding formulas assume that we have retained all $p$ modes associated with the positive
 singular values of $X$.
 
-We can easily adapt all of the formulas to describe a situation in which we instead retain only
+We can adjust our formulas to describe a situation in which we instead retain only
 the $r < p$ largest singular values.
 
 In that case, we simply replace $\Sigma$ with the appropriate $r \times r$ matrix of singular values,
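
A sketch of that truncation under the same hypothetical setup (the choice `r = 3` is arbitrary; `np.linalg.svd` returns singular values in descending order, so the leading columns are the ones to keep):

```python
# Keep only the r largest singular values and the matching columns
r = 3
U_r, sig_r, V_r = U[:, :r], sigma[:r], V[:, :r]

A_tilde_r = U_r.T @ Xprime @ V_r @ np.diag(1.0 / sig_r)   # r x r
Lam_r, W_r = np.linalg.eig(A_tilde_r)
Phi_r = Xprime @ V_r @ np.diag(1.0 / sig_r) @ W_r         # m x r modes
b1_r = np.linalg.solve(W_r * Lam_r, U_r.T @ X[:, 0])      # r amplitudes
```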
