
Commit 91e4030

Tom's March 5 edits of svd lecture
1 parent 8ee128e commit 91e4030


1 file changed: +82 -57 lines changed


lectures/svd_intro.md

Lines changed: 82 additions & 57 deletions
@@ -560,7 +560,7 @@ def compare_pca_svd(da):

## Dynamic Mode Decomposition (DMD)

-We now turn to the case in which $m >>n$ in which an $m \times n$ data matrix $\tilde X$ contains many more random variables $m$ than observations $n$.
+We turn to the case in which $m >> n$, i.e., in which an $m \times n$ data matrix $\tilde X$ contains many more random variables $m$ than observations $n$.

This is the **tall and skinny** case associated with **Dynamic Mode Decomposition**.

@@ -597,60 +597,80 @@ In forming $X$ and $X'$, we have in each case dropped a column from $\tilde X$

Evidently, $X$ and $X'$ are both $m \times \tilde n$ matrices where $\tilde n = n - 1$.

-We now let the rank of $X$ be $p \neq \min(m, \tilde n) = \tilde n$.
+We denote the rank of $X$ as $p \leq \min(m, \tilde n) = \tilde n$.

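To make the shapes concrete, here is a minimal sketch in Python/NumPy of forming $X$ and $X'$ from a tall-and-skinny data matrix; the variable names and the use of random data are illustrative assumptions, not part of the lecture's own code:

```python
import numpy as np

# Illustrative tall-and-skinny data matrix: m variables, n observations, m >> n
m, n = 500, 20
rng = np.random.default_rng(0)
X_tilde = rng.standard_normal((m, n))

# Drop the last column to form X and the first column to form X'
X = X_tilde[:, :-1]        # m x (n - 1)
X_prime = X_tilde[:, 1:]   # m x (n - 1)

print(X.shape, X_prime.shape)      # both m x n_tilde, with n_tilde = n - 1
print(np.linalg.matrix_rank(X))    # the rank p <= n_tilde
```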
We start with a system consisting of $m$ least squares regressions of **everything** on one lagged value of **everything**:

$$
X' = A X + \epsilon
-$$
+$$

where

$$
A = X' X^{+}
-$$
+$$ (eq:Afullformula)

and where the $\tilde n \times m$ matrix $X^{+}$ is the Moore-Penrose generalized inverse of $X$, so that $A$ is a (possibly huge) $m \times m$ matrix.

-The $i$ the row of $A$ is an $m \times 1$ vector of regression coefficients of $X_{i,t+1}$ on $X_{j,t}, j = 1, \ldots, m$.
+The $i$th row of $A$ is an $m \times 1$ vector of regression coefficients of $X_{i,t+1}$ on $X_{j,t}, j = 1, \ldots, m$.

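As an illustration only (not the lecture's code), the full matrix $A$ could be computed directly with NumPy's pseudoinverse; for genuinely large $m$ this is exactly the computation that DMD is designed to avoid:

```python
import numpy as np

# Small illustrative example of A = X' X^+ ; for large m this m x m matrix
# is what dynamic mode decomposition lets us avoid forming explicitly.
m, n = 50, 8
rng = np.random.default_rng(1)
X_tilde = rng.standard_normal((m, n))
X, X_prime = X_tilde[:, :-1], X_tilde[:, 1:]

A = X_prime @ np.linalg.pinv(X)    # m x m matrix of regression coefficients
print(A.shape)                     # (50, 50)
```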
-Think about the (reduced) singular value decomposition
+Consider the (reduced) singular value decomposition

$$
X = U \Sigma V^T
$$

where $U$ is $m \times p$, $\Sigma$ is a $p \times p$ diagonal matrix, and $V^T$ is a $p \times \tilde n$ matrix.

Here $p$ is the rank of $X$, where necessarily $p \leq \tilde n$.

+(We have described and illustrated a reduced singular value decomposition above, and compared it with a full singular value decomposition.)

We could construct the generalized inverse $X^+$ of $X$ by using
a singular value decomposition $X = U \Sigma V^T$ to compute

$$
X^{+} = V \Sigma^{-1} U^T
-$$
+$$ (eq:Xpinverse)

where the matrix $\Sigma^{-1}$ is constructed by replacing each non-zero singular value $\sigma_j$ on the diagonal of $\Sigma$ with $\sigma_j^{-1}$.

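A minimal NumPy sketch of this construction (illustrative data; `np.linalg.svd` with `full_matrices=False` returns the reduced decomposition):

```python
import numpy as np

# Reduced SVD of X and the pseudoinverse built from it
m, n = 50, 8
rng = np.random.default_rng(1)
X = rng.standard_normal((m, n))[:, :-1]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)    # reduced SVD
X_pinv = Vt.T @ np.diag(1 / sigma) @ U.T                 # V Sigma^{-1} U^T

# agrees with NumPy's built-in pseudoinverse up to floating-point error
print(np.allclose(X_pinv, np.linalg.pinv(X)))            # True
```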
-The idea behind **dynamic mode decomposition** is to construct an approximation that
+We could use formula {eq}`eq:Xpinverse` together with formula {eq}`eq:Afullformula` to compute the matrix $A$ of regression coefficients.
+
+Instead of doing that, we'll use **dynamic mode decomposition** to compute a rank $r$ approximation to $A$,
+where $r << p$.
+
+The idea behind **dynamic mode decomposition** is to construct this low rank approximation to $A$ that

* sidesteps computing the generalized inverse $X^{+}$

-* constructs an $m \times r$ matrix $\Phi$ that captures effects on all $m$ variables of $r < < p$ **modes** that are associated with the $r$ largest singular values
+* constructs an $m \times r$ matrix $\Phi$ that captures the effects on all $m$ variables of the $r << p$ **modes** that are associated with the $r$ largest eigenvalues of $A$

-* uses $\Phi$ and powers of $r$ singular values to forecast *future* $X_t$'s
+* uses $\Phi$ and powers of the $r$ largest eigenvalues of $A$ to forecast *future* $X_t$'s

-The beauty of **dynamic mode decomposition** is that we accomplish this without ever computing the regression coefficients $A = X' X^{+}$.
+Important properties of the DMD algorithm that we shall describe soon are that
+
+* columns of the $m \times r$ matrix $\Phi$ are the eigenvectors of $A$ that correspond to the $r$ largest eigenvalues of $A$
+
+* Tu et al. {cite}`tu_Rowley` verify these useful properties
+
+An attractive feature of **dynamic mode decomposition** is that we avoid computing the huge $m \times m$ matrix $A = X' X^{+}$ of regression coefficients, while, under the right conditions, acquiring a good low-rank approximation of $A$ with little computational effort.

+### Steps and Explanations

To construct a DMD, we deploy the following steps:

-* As described above, though it would be costly, we could compute an $m \times m$ matrix $A$ by solving
+* As mentioned above, though it would be costly, we could compute an $m \times m$ matrix $A$ by solving

$$
A = X' V \Sigma^{-1} U^T
@@ -660,10 +680,7 @@ To construct a DMD, we deploy the following steps:

But we won't do that.

-We'll compute the $r$ largest singular values of $X$.
-
-We'll form matrices $\tilde V, \tilde U$ corresponding to those $r$ singular values.
+We'll compute the $r$ largest singular values of $X$ and form matrices $\tilde V, \tilde U$ corresponding to those $r$ singular values.

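A sketch of this truncation step, using the illustrative names above (the choice $r = 3$ is arbitrary):

```python
import numpy as np

# Keep only the r largest singular values of X and the corresponding
# singular vectors
m, n, r = 50, 8, 3
rng = np.random.default_rng(1)
X = rng.standard_normal((m, n))[:, :-1]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
U_tilde = U[:, :r]                 # m x r
Sigma_tilde = np.diag(sigma[:r])   # r x r, largest singular values first
V_tilde = Vt[:r, :].T              # n_tilde x r
```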
@@ -681,7 +698,8 @@ To construct a DMD, we deploy the following steps:

\tilde X_{t+1} = \tilde A \tilde X_t
$$

-where an approximation $\check X_t$ to (i.e., a projection of) the original $m \times 1$ vector $X_t$ can be acquired from
+where an approximation $\check X_t$ to the original $m \times 1$ vector $X_t$ can be acquired by projecting $X_t$ onto a subspace spanned by
+the columns of $\tilde U$:

$$
\check X_t = \tilde U \tilde X_t
@@ -697,10 +715,6 @@ To construct a DMD, we deploy the following steps:

$$ (eq:tildeAform)

-* Tu et al. {cite}`tu_Rowley` verify that eigenvalues and eigenvectors of $\tilde A$ equal the leading eigenvalues and associated eigenvectors of $A$.
-
* Construct an eigendecomposition of $\tilde A$

$$
@@ -710,12 +724,14 @@ To construct a DMD, we deploy the following steps:

where $\Lambda$ is an $r \times r$ diagonal matrix of eigenvalues and the columns of $W$ are corresponding eigenvectors
of $\tilde A$. Both $\Lambda$ and $W$ are $r \times r$ matrices.

-* Construct the $m \times r$ matrix
+* A key step now is to construct the $m \times r$ matrix

$$
\Phi = X' \tilde V \tilde \Sigma^{-1} W
$$ (eq:Phiformula)

+As asserted above, columns of $\Phi$ are the eigenvectors of $A$ corresponding to the largest eigenvalues of $A$.

We can construct an $r \times m$ matrix generalized inverse $\Phi^{+}$ of $\Phi$.
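The sketch below continues the illustrative example; it assumes the reduced matrix is $\tilde A = \tilde U^T X' \tilde V \tilde \Sigma^{-1}$ (the standard reduced-order DMD operator of Tu et al.), since the defining equation {eq}`eq:tildeAform` is outside this diff:

```python
import numpy as np

# Continue the illustrative example: build A_tilde, its eigendecomposition,
# the mode matrix Phi (eq:Phiformula), and the generalized inverse Phi^+.
# The formula for A_tilde below is an assumption (standard DMD), not a quote
# of the lecture's eq:tildeAform.
m, n, r = 50, 8, 3
rng = np.random.default_rng(1)
X_tilde_data = rng.standard_normal((m, n))
X, X_prime = X_tilde_data[:, :-1], X_tilde_data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
U_tilde, Sigma_tilde, V_tilde = U[:, :r], np.diag(sigma[:r]), Vt[:r, :].T

A_tilde = U_tilde.T @ X_prime @ V_tilde @ np.linalg.inv(Sigma_tilde)  # r x r
Lam, W = np.linalg.eig(A_tilde)       # eigenvalues Lambda and eigenvectors W

Phi = X_prime @ V_tilde @ np.linalg.inv(Sigma_tilde) @ W              # m x r
Phi_plus = np.linalg.pinv(Phi)                                        # r x m
```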
@@ -744,9 +760,46 @@ To construct a DMD, we deploy the following steps:

$$ (eq:bphieqn)

-(Since it involves smaller matrices, formula {eq}`eq:beqnsmall` below is a computationally more efficient way to compute $b$)
-
-* Then define _projected data_ $\tilde X_1$ by
+
+### Putting Things Together
+
+With $\Lambda, \Phi, \Phi^{+}$ in hand, our least-squares dynamics fitted to the $r$ modes
+are governed by
+
+$$
+X_{t+1}^{(r)} = \Phi \Lambda \Phi^{+} X_t^{(r)}
+$$ (eq:Xdynamicsapprox)
+
+where $X_t^{(r)}$ is an $m \times 1$ vector.
+
+By virtue of equation {eq}`eq:APhiLambda`, it follows that **if we had kept $r = p$**, this equation would be equivalent to
+
+$$
+X_{t+1} = A X_t .
+$$ (eq:Xdynamicstrue)
+
+When $r << p$, equation {eq}`eq:Xdynamicsapprox` is an approximation (of reduced order $r$) to the $X$ dynamics in equation
+{eq}`eq:Xdynamicstrue`.
+
+Conditional on $X_t$, we construct forecasts $\check X_{t+j}$ of $X_{t+j}, j = 1, 2, \ldots,$ from
+
+$$
+\check X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
+$$ (eq:checkXevoln)

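A short sketch of the forecasting formula {eq}`eq:checkXevoln`, continuing the illustrative `Phi`, `Lam`, `Phi_plus` constructed above (eigenvalues can be complex, so a real-valued forecast takes the real part):

```python
import numpy as np

# Forecast x_{t+j} = Phi Lambda^j Phi^+ x_t using the rank-r DMD modes.
# Phi, Lam, Phi_plus are the illustrative objects constructed above.
def dmd_forecast(Phi, Lam, Phi_plus, x_t, j):
    # Phi * Lam**j scales each column of Phi by the j-th power of its
    # eigenvalue, which equals Phi @ diag(Lam**j)
    return (Phi * Lam**j) @ (Phi_plus @ x_t)

# e.g. forecast three periods ahead from the last observed snapshot:
# x_hat = dmd_forecast(Phi, Lam, Phi_plus, X[:, -1], j=3).real
```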
+## Some Refinements
+
+Because it involves smaller matrices, formula {eq}`eq:beqnsmall` below is a computationally more efficient way to compute $b$ than using equation {eq}`eq:bphieqn`.
+
+Define a projection $\tilde X_1$ of $X_1$ onto the $r$ dominant modes by

$$
\tilde X_1 = \Phi b
@@ -784,39 +837,11 @@ To construct a DMD, we deploy the following steps:

which is computationally more efficient than equation {eq}`eq:bphieqn`.

+* It follows that the following equation is equivalent to {eq}`eq:checkXevoln`:

-### Putting Things Together
-
-With $\Lambda, \Phi, \Phi^{+}$ in hand, our least-squares fitted dynamics fitted to the $r$ modes
-are governed by
-
-$$
-X_{t+1} = \Phi \Lambda \Phi^{+} X_t .
-$$ (eq:Xdynamicsapprox)
-
-But by virtue of equation {eq}`eq:APhiLambda`, it follows that **if we had kept $r = p$**, this equation would be equivalent with
-
-$$
-X_{t+1} = A X_t .
-$$ (eq:Xdynamicstrue)
-
-When $r << p $, equation {eq}`eq:Xdynamicsapprox` is an approximation (of reduced order $r$) to the $X$ dynamics in equation
-{eq}`eq:Xdynamicstrue`.
-
-Conditional on $X_t$, we construct forecasts $\check X_{t+j} $ of $X_{t+j}, j = 1, 2, \ldots, $ from
-
-$$
-\check X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
-$$
-
-or
-
-$$
-\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} \tilde X_t
-$$

+$$
+\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} \tilde X_t
+$$ (eq:checkXevoln2)
