lectures/svd_intro.md: 77 additions & 17 deletions
@@ -842,7 +842,7 @@ Thus, our estimator $\hat A = X' X^+$ of the $m \times m$ matrix of coefficient
 
 $$
 \hat A = X' V \Sigma^{-1} U^T
-$$
+$$ (eq:AhatSVDformula)
 
 In addition to doing that, we’ll eventually use **dynamic mode decomposition** to compute a rank $ r $ approximation to $ \hat A $,
 where $ r < p $.
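
To make the labeled formula concrete, here is a minimal numpy sketch of computing $\hat A = X' V \Sigma^{-1} U^T$ from a pair of snapshot matrices. The data are fabricated random matrices, and the names `X` and `Xprime` are placeholders for the lecture's $X$ and $X'$, not part of the lecture's code.

```python
import numpy as np

# fabricated example: m variables observed over n + 1 periods
m, n = 5, 40
rng = np.random.default_rng(0)
data = rng.standard_normal((m, n + 1))
X, Xprime = data[:, :-1], data[:, 1:]      # stand-ins for X and X'

# reduced SVD: X = U Sigma V^T
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

# hat A = X' V Sigma^{-1} U^T, the SVD form of hat A = X' X^+
A_hat = Xprime @ V @ np.diag(1 / sigma) @ U.T

# agrees with the pseudo-inverse formula hat A = X' X^+
assert np.allclose(A_hat, Xprime @ np.linalg.pinv(X))
```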
@@ -977,7 +977,7 @@ $$
 \hat b_{t+1} = \Lambda \hat b_t
 $$
 
-where now our endoder is
+where now our encoder is
 
 $$
 \hat b_t = W^{-1} U^T X_t
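
Continuing the sketch above, the encoder can be applied to every column of $X$ at once. Here $W$ and $\Lambda$ come from the eigendecomposition of $\tilde A = U^T X' V \Sigma^{-1}$, as in the lecture; the variable names remain illustrative only.

```python
# tilde A = U^T X' V Sigma^{-1}, the representation of hat A in the U basis
A_tilde = U.T @ Xprime @ V @ np.diag(1 / sigma)

# eigendecomposition tilde A = W Lambda W^{-1}
Lam, W = np.linalg.eig(A_tilde)            # Lam holds the eigenvalues

# encoder: hat b_t = W^{-1} U^T X_t, computed here for all t at once
b_hat = np.linalg.solve(W, U.T @ X)
```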
@@ -1158,14 +1158,71 @@ $$
 where
 
 $$
-\begin{aligned}
-\check b_t & = \Phi^+ X_t \cr
-X_t & = \Phi \check b_t
-\end{aligned}
+\check b_t = \Phi^+ X_t
+$$ (eq:decoder102)
+
+Here $\check b_t$ is a $p \times 1$ vector of regression coefficients, namely, the component of $\check b$
+corresponding to column $t$ of the $p \times n$ matrix of regression coefficients
+
 $$
+\check b = \Phi^{\dagger} X .
+$$ (eq:decoder103)
+
+Furthermore, $\check X_t$ is the $m\times 1$ vector of decoded or projected values of $X_t$ corresponding
+to column $t$ of the $m \times n$ matrix $X$.
+
+Since $\Phi$ has $p$ linearly independent columns, the generalized inverse of $\Phi$ is
 
+$$
+\Phi^{\dagger} = (\Phi^T \Phi)^{-1} \Phi^T
+$$
 
-But there is a better way to compute the $p \times 1$ vector $\check b_t$
+and so
+
+$$
+\check b = (\Phi^T \Phi)^{-1} \Phi^T X .
+$$ (eq:checkbform)
+
+Here $\check b$ can be recognized as a matrix of least squares regression coefficients of the matrix
+$X$ on the matrix $\Phi$, and $\Phi \check b$ is the least squares projection of $X$ on $\Phi$.
+
+In more detail, by virtue of least-squares projection theory discussed here <https://python-advanced.quantecon.org/orth_proj.html>,
+we can represent $X$ as the sum of the projection $\check X$ of $X$ on $\Phi$ and a matrix of least squares errors, where
+
+$$
+\check X_t = \Phi \check b_t .
+$$
+
+The least squares projection $\check X$ is related to $X$ by
+
+$$
+X = \Phi \check b + \epsilon
+$$
+
+where $\epsilon$ is an $m \times n$ matrix of least squares errors satisfying the least squares
+orthogonality conditions $\epsilon^T \Phi = 0$ or
+
+$$
+(X - \Phi \check b)^T \Phi = 0_{n \times p} .
+$$ (eq:orthls)
+
+Rearranging the orthogonality conditions {eq}`eq:orthls` gives $X^T \Phi = \check b^T \Phi^T \Phi$,
+which implies formula {eq}`eq:checkbform`.
+
+### Alternative algorithm
+
+There is a better way to compute the $p \times 1$ vector $\check b_t$ than provided by formula
+{eq}`eq:decoder102`.
 
 In particular, the following argument from {cite}`DDSE_book` (page 240) provides a computationally efficient way
 to compute $\check b_t$.
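
A short numerical check of the least squares formulas above, assuming (as in the text) that $\Phi$ has full column rank; the matrices `Phi` and `X` here are fabricated purely for illustration.

```python
import numpy as np

# fabricated Phi (m x p, full column rank) and X (m x n)
m, p, n = 6, 3, 10
rng = np.random.default_rng(1)
Phi = rng.standard_normal((m, p))
X = rng.standard_normal((m, n))

# check b = (Phi^T Phi)^{-1} Phi^T X, the regression of X on Phi
b_check = np.linalg.solve(Phi.T @ Phi, Phi.T @ X)

# the same matrix via the generalized inverse Phi^+
assert np.allclose(b_check, np.linalg.pinv(Phi) @ X)

# least squares errors are orthogonal to Phi: (X - Phi b)^T Phi = 0
eps = X - Phi @ b_check
assert np.allclose(eps.T @ Phi, np.zeros((n, p)))
```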
@@ -1184,7 +1241,7 @@ where $\check b_1$ is an $r \times 1$ vector.
 
 Recall from representation 1 above that $X_1 = U \tilde b_1$, where $\tilde b_1$ is the time $1$ basis vector for representation 1.
 
-It then follows that
+It then follows from equation {eq}`eq:Phiformula` that
 
 $$
 U \tilde b_1 = X' V \Sigma^{-1} W \check b_1
@@ -1196,7 +1253,9 @@ $$
 \tilde b_1 = U^T X' V \Sigma^{-1} W \check b_1
 $$
 
-Since $ \tilde A = U^T X' V \Sigma^{-1}$, it follows that
+Recall that from equation {eq}`eq:AhatSVDformula`, $ \tilde A = U^T X' V \Sigma^{-1}$.
+
+It then follows that
 
 $$
 \tilde b_1 = \tilde A W \check b_1
@@ -1208,7 +1267,7 @@ $$
 \tilde b_1 = W \Lambda \check b_1
 $$
 
-Consesquently,
+Consequently,
 
 $$
 \check b_1 = ( W \Lambda)^{-1} \tilde b_1
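
Continuing the earlier sketch, we can verify numerically that this efficient formula agrees with the direct formula $\check b_1 = \Phi^+ X_1$. The agreement relies on the full-rank setting of the fabricated example, in which the columns of $U$ span the column space of $X$; $\Phi = X' V \Sigma^{-1} W$ is the matrix from equation {eq}`eq:Phiformula`.

```python
# Phi = X' V Sigma^{-1} W, continuing the earlier sketch
Phi = Xprime @ V @ np.diag(1 / sigma) @ W

X1 = X[:, 0]                               # the time-1 snapshot

# efficient route: check b_1 = (W Lambda)^{-1} U^T X_1
b1_fast = np.linalg.solve(W @ np.diag(Lam), U.T @ X1)

# direct route: check b_1 = Phi^+ X_1
b1_direct = np.linalg.pinv(Phi) @ X1

assert np.allclose(b1_fast, b1_direct)
```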
@@ -1218,34 +1277,35 @@ or
 
 $$
-\check b_1 = ( W \Lambda)^{-1} U^T X_1
+\check b_1 = ( W \Lambda)^{-1} U^T X_1 ,
 $$ (eq:beqnsmall)
 
-which is computationally more efficient than the following instance of our earlier equation for computing the initial vector $\check b_1$:
+which is computationally more efficient than the following instance of equation {eq}`eq:decoder102` for computing the initial vector $\check b_1$:
 
 $$
 \check b_1= \Phi^{+} X_1
 $$ (eq:bphieqn)
 
-Components of the basis vector $\check b_t = \Phi^+ X_t \equiv (W \Lambda)^{-1} U^T X_t$ are often called **exact** DMD nodes.
+The literature on DMD sometimes labels components of the basis vector $\check b_t = \Phi^+ X_t \equiv (W \Lambda)^{-1} U^T X_t$ as **exact** DMD modes.
 
-Conditional on $X_t$, we can construct forecasts $\overline X_{t+j} $ of $X_{t+j}, j = 1, 2, \ldots, $ from
+Conditional on $X_t$, we can compute our decoded $\check X_{t+j}, j = 1, 2, \ldots $ from
 either
 
 $$
-\overline X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
+\check X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
 $$ (eq:checkXevoln)
 
 or
 
 $$
-\overline X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} U^T X_t
+\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} U^T X_t .
 $$ (eq:checkXevoln2)
 
+We can then use $\check X_{t+j}$ to forecast $X_{t+j}$.
 
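
As a sketch, the two decoding formulas {eq}`eq:checkXevoln` and {eq}`eq:checkXevoln2` can be implemented and compared directly; this continues the fabricated data and variable names used earlier, with an arbitrary horizon `j = 3`.

```python
# decoded values check X_{t+j} = Phi Lambda^j Phi^+ X_t, here with t = 1, j = 3
j = 3
Lam_j = np.diag(Lam ** j)                  # Lambda^j as a diagonal matrix

# eq:checkXevoln, via the generalized inverse of Phi
Xcheck = Phi @ Lam_j @ np.linalg.pinv(Phi) @ X1

# eq:checkXevoln2, via the cheaper (W Lambda)^{-1} U^T route
Xcheck_alt = Phi @ Lam_j @ np.linalg.solve(W @ np.diag(Lam), U.T @ X1)

assert np.allclose(Xcheck, Xcheck_alt)
```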
@@ -1254,7 +1314,7 @@ $$ (eq:checkXevoln2)
 Some of the preceding formulas assume that we have retained all $p$ modes associated with the positive
 singular values of $X$.
 
-We can easily adapt all of the formulas to describe a situation in which we instead retain only
+We can adjust our formulas to describe a situation in which we instead retain only
 the $r < p$ largest singular values.
 
 In that case, we simply replace $\Sigma$ with the appropriate $r \times r$ matrix of singular values,
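
A minimal sketch of this reduced-rank adjustment, truncating the fabricated example above to its $r$ largest singular values; the choice `r = 3` is arbitrary.

```python
# keep only the r largest singular values and the associated columns
r = 3
U_r, sigma_r, V_r = U[:, :r], sigma[:r], V[:, :r]

# r x r counterpart of tilde A and its eigendecomposition
A_tilde_r = U_r.T @ Xprime @ V_r @ np.diag(1 / sigma_r)
Lam_r, W_r = np.linalg.eig(A_tilde_r)

# the r retained DMD modes
Phi_r = Xprime @ V_r @ np.diag(1 / sigma_r) @ W_r
```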