## Dynamic Mode Decomposition (DMD)

We turn to the case in which $m >> n$, so that an $m \times n$ data matrix $\tilde X$ contains many more random variables $m$ than observations $n$.

This is the **tall and skinny** case associated with **Dynamic Mode Decomposition**.

In forming $X$ and $X'$, we have in each case dropped a column from $\tilde X$.

Evidently, $ X$ and $ X'$ are both $m \times \tilde n$ matrices where $\tilde n = n - 1$.
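
As a concrete illustration, here is a minimal NumPy sketch of one standard way to form $X$ and $X'$ from $\tilde X$, namely by dropping the last and the first column, respectively; the array names and sizes below are assumptions made only for illustration.

```python
import numpy as np

# A minimal sketch: form X and X' from the data matrix X_tilde by dropping
# its last and its first column, respectively (illustrative sizes).
m, n = 6, 4
X_tilde = np.arange(m * n, dtype=float).reshape(m, n)

X = X_tilde[:, :-1]            # columns 1, ..., n-1
X_prime = X_tilde[:, 1:]       # columns 2, ..., n

print(X.shape, X_prime.shape)  # both are m x (n - 1)
```
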

We denote the rank of $X$ as $p \leq \min(m, \tilde n) = \tilde n$.

We start with a system consisting of $m$ least squares regressions of **everything** on one lagged value of **everything**:

$$
X' = A X + \epsilon
$$

where

$$
A = X' X^{+}
$$ (eq:Afullformula)

and where the (possibly huge) $\tilde n \times m$ matrix $X^{+}$ is the Moore-Penrose generalized inverse of $X$.

The $i$th row of $A$ is an $m \times 1$ vector of regression coefficients of $X_{i,t+1}$ on $X_{j,t}, j = 1, \ldots, m$.

Consider the (reduced) singular value decomposition

$$
X = U \Sigma V^T
$$

where $U$ is $m \times p$, $\Sigma$ is a $p \times p$ diagonal matrix, and $ V^T$ is a $p \times \tilde n$ matrix.

Here $p$ is the rank of $X$, where necessarily $p \leq \tilde n$.

(We have described and illustrated a reduced singular value decomposition above, and compared it with a full singular value decomposition.)
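
As a quick numerical illustration with random data (the sizes are assumptions, not the lecture's example), NumPy's economy-mode SVD delivers exactly these shapes:

```python
import numpy as np

# Reduced SVD of a tall-and-skinny X: U is m x p, Sigma is p x p, V^T is p x n_tilde
rng = np.random.default_rng(1)
m, n_tilde = 8, 3
X = rng.standard_normal((m, n_tilde))   # generically of full rank p = n_tilde

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)  # reduced ("economy") SVD

print(U.shape, sigma.shape, Vt.shape)                  # (8, 3) (3,) (3, 3)
print(np.allclose(X, U @ np.diag(sigma) @ Vt))         # True: X = U Sigma V^T
```
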
We could construct the generalized inverse $X^+$ of $X$ by using
a singular value decomposition $X = U \Sigma V^T$ to compute

$$
X^{+} = V \Sigma^{-1} U^T
$$ (eq:Xpinverse)

where the matrix $\Sigma^{-1}$ is constructed by replacing each non-zero element of $ \Sigma$ with $\sigma_j^{-1}$.

We could use formula {eq}`eq:Xpinverse` together with formula {eq}`eq:Afullformula` to compute the matrix $A$ of regression coefficients.
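
For example, here is a minimal sketch of that brute-force route with random data; the names and sizes are illustrative assumptions.

```python
import numpy as np

# Brute force: build X^+ from the reduced SVD (eq:Xpinverse) and then the
# m x m matrix A = X' X^+ of regression coefficients (eq:Afullformula).
rng = np.random.default_rng(2)
m, n_tilde = 8, 3
X = rng.standard_normal((m, n_tilde))
X_prime = rng.standard_normal((m, n_tilde))

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
X_plus = Vt.T @ np.diag(1 / sigma) @ U.T        # V Sigma^{-1} U^T

print(np.allclose(X_plus, np.linalg.pinv(X)))   # agrees with NumPy's pseudoinverse

A = X_prime @ X_plus
print(A.shape)                                  # (m, m)
```
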

Instead of doing that, we'll use **dynamic mode decomposition** to compute a rank $r$ approximation to $A$, where $r << p$.

The idea behind **dynamic mode decomposition** is to construct this low rank approximation to $A$ that

* sidesteps computing the generalized inverse $X^{+}$
* constructs an $m \times r$ matrix $\Phi$ that captures effects on all $m$ variables of $r << p$ **modes** that are associated with the $r$ largest eigenvalues of $A$
* uses $\Phi$ and powers of the $r$ largest eigenvalues of $A$ to forecast *future* $X_t$'s

An important property of the DMD algorithm that we shall describe soon is that

* columns of the $m \times r$ matrix $\Phi$ are the eigenvectors of $A$ that correspond to the $r$ largest eigenvalues of $A$
* Tu et al. {cite}`tu_Rowley` verify this useful property


An attractive feature of **dynamic mode decomposition** is that we avoid computing the huge matrix $A = X' X^{+}$ of regression coefficients while, under the right conditions, acquiring a good low-rank approximation of $A$ with little computational effort.

### Steps and Explanations
To construct a DMD, we deploy the following steps:

* As mentioned above, though it would be costly, we could compute an $m \times m$ matrix $A$ by solving

$$
A = X' V \Sigma^{-1} U^T
$$

But we won't do that.

We'll compute the $r$ largest singular values of $X$ and form matrices $\tilde V, \tilde U$, and $\tilde \Sigma$ corresponding to those $r$ singular values (the code sketch at the end of these steps illustrates the construction).


In terms of a reduced $r \times 1$ state $\tilde X_t$, the fitted dynamics become

$$
\tilde X_{t+1} = \tilde A \tilde X_t
$$

where an approximation $\check X_t$ to the original $m \times 1$ vector $X_t$ can be acquired by projecting $X_t$ onto a subspace spanned by
the columns of $\tilde U$:

$$
\check X_t = \tilde U \tilde X_t
$$


Here the $r \times r$ transition matrix $\tilde A$ is computed from

$$
\tilde A = \tilde U^T X' \tilde V \tilde \Sigma^{-1}
$$ (eq:tildeAform)

* Construct an eigendecomposition of $\tilde A$

$$
\tilde A W = W \Lambda
$$

where $\Lambda$ is an $r \times r$ diagonal matrix of eigenvalues and the columns of $W$ are corresponding eigenvectors of $\tilde A$. Both $\Lambda$ and $W$ are $r \times r$ matrices.

* A key step now is to construct the $m \times r$ matrix

$$
\Phi = X' \tilde V \tilde \Sigma^{-1} W
$$ (eq:Phiformula)


As asserted above, columns of $\Phi$ are the eigenvectors of $A$ corresponding to the $r$ largest eigenvalues of $A$.


We can construct an $r \times m$ generalized inverse $\Phi^{+}$ of $\Phi$.


The eigenvector property of $\Phi$ asserted above can be written compactly as

$$
A \Phi = \Phi \Lambda
$$ (eq:APhiLambda)

An $r \times 1$ vector $b$ of coefficients on the $r$ modes can be computed from

$$
b = \Phi^{+} X_1
$$ (eq:bphieqn)

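
To make these steps concrete, here is a minimal end-to-end sketch with synthetic data whose columns lie in an $r$-dimensional subspace, so that truncating the SVD at rank $r$ loses nothing; all names, sizes, and the random data are illustrative assumptions rather than code from this lecture.

```python
import numpy as np

# Synthetic data of rank r, so that the rank-r truncation below is exact
rng = np.random.default_rng(0)
m, n_tilde, r = 50, 20, 3

B = rng.standard_normal((m, r))               # lifts an r-dimensional latent state
Z = rng.standard_normal((r, n_tilde + 1))     # latent states
X_tilde = B @ Z                               # m x (n_tilde + 1) data matrix of rank r
X, X_prime = X_tilde[:, :-1], X_tilde[:, 1:]  # lagged and led data matrices

# Truncated (rank r) singular value decomposition of X
U, S, Vt = np.linalg.svd(X, full_matrices=False)
U_til, S_til, V_til = U[:, :r], S[:r], Vt[:r, :].T

# r x r transition matrix (eq:tildeAform) and its eigendecomposition
A_til = U_til.T @ X_prime @ V_til @ np.diag(1 / S_til)
Lam, W = np.linalg.eig(A_til)

# m x r matrix of modes (eq:Phiformula), its generalized inverse, and b (eq:bphieqn)
Phi = X_prime @ V_til @ np.diag(1 / S_til) @ W
Phi_plus = np.linalg.pinv(Phi)
b = Phi_plus @ X[:, 0]                        # taking X_1 to be the first column of X

# Check the property asserted above: A Phi = Phi Lambda, with A = X' X^+
A = X_prime @ np.linalg.pinv(X, rcond=1e-10)  # discard numerically-zero singular values
print(np.max(np.abs(A @ Phi - Phi * Lam)))    # ~0 up to rounding
```
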
### Putting Things Together

With $\Lambda, \Phi, \Phi^{+}$ in hand, our least-squares dynamics fitted to the $r$ modes are governed by


$$
X_{t+1}^{(r)} = \Phi \Lambda \Phi^{+} X_t^{(r)} .
$$ (eq:Xdynamicsapprox)

where $X_t^{(r)}$ is an $m \times 1$ vector.

By virtue of equation {eq}`eq:APhiLambda`, it follows that **if we had kept $r = p$**, this equation would be equivalent with


$$
X_{t+1} = A X_t .
$$ (eq:Xdynamicstrue)


When $r << p$, equation {eq}`eq:Xdynamicsapprox` is an approximation (of reduced order $r$) to the $X$ dynamics in equation {eq}`eq:Xdynamicstrue`.

Conditional on $X_t$, we construct forecasts $\check X_{t+j} $ of $X_{t+j}, j = 1, 2, \ldots, $ from

$$
\check X_{t+j} = \Phi \Lambda^j \Phi^{+} X_t
$$ (eq:checkXevoln)

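
As a minimal illustration, the forecasting rule {eq}`eq:checkXevoln` is a one-liner once $\Phi$, $\Lambda$, and $\Phi^{+}$ are in hand; the function and argument names below are assumptions, with the DMD objects computed as in the sketch above.

```python
import numpy as np

def dmd_forecast(Phi, Lam, Phi_plus, X_t, j):
    """Forecast X_{t+j} as Phi Lambda^j Phi^+ X_t, as in eq:checkXevoln.

    Phi is m x r, Lam holds the r eigenvalues of A_tilde, Phi_plus is the
    r x m generalized inverse of Phi, and X_t is the current m x 1 state.
    """
    return Phi @ (Lam**j * (Phi_plus @ X_t))
```
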
## Some Refinements
Because it involves smaller matrices, formula {eq}`eq:beqnsmall` below is a computationally more efficient way to compute $b$ than using equation {eq}`eq:bphieqn`.

Define a projection $\tilde X_1$ of $X_1$ onto the $r$ dominant modes by

$$
\tilde X_1 = \Phi b
$$


This leads to formula {eq}`eq:beqnsmall`, which is computationally more efficient than equation {eq}`eq:bphieqn`.

* It follows that the following equation is equivalent with {eq}`eq:checkXevoln`

$$
\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} \tilde X_t
$$

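
As a sanity check, here is a small self-contained sketch with random data and assumed sizes, taking $\tilde X_t = \tilde U^T X_t$ as the projected state (an assumption consistent with $\check X_t = \tilde U \tilde X_t$ above); it verifies that this formula and {eq}`eq:checkXevoln` produce the same forecast when $X_t$ lies in the span of the $r$ modes.

```python
import numpy as np

# Compare the two forecast formulas for a state X_t in the span of the r modes
rng = np.random.default_rng(3)
m, n_tilde, r, j = 30, 10, 4, 3
X = rng.standard_normal((m, n_tilde))
X_prime = rng.standard_normal((m, n_tilde))

U, S, Vt = np.linalg.svd(X, full_matrices=False)
U_til, S_til, V_til = U[:, :r], S[:r], Vt[:r, :].T
A_til = U_til.T @ X_prime @ V_til @ np.diag(1 / S_til)
Lam, W = np.linalg.eig(A_til)
Phi = X_prime @ V_til @ np.diag(1 / S_til) @ W

X_t = Phi @ rng.standard_normal(r)        # a state spanned by the r modes
X_tilde_t = U_til.T @ X_t                 # projected state (assumed = U_til^T X_t)

f1 = Phi @ (Lam**j * (np.linalg.pinv(Phi) @ X_t))                   # eq:checkXevoln
f2 = Phi @ (Lam**j * np.linalg.solve(W @ np.diag(Lam), X_tilde_t))  # formula above
print(np.max(np.abs(f1 - f2)))            # ~0 up to rounding
```
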