Commit 4f31434

Tom's March 16 edits of svd lecture
1 parent 924938a commit 4f31434

1 file changed

+102
-102
lines changed

lectures/svd_intro.md

Lines changed: 102 additions & 102 deletions
@@ -628,7 +628,7 @@ $$
628628
A = X' X^{+} .
629629
$$ (eq:Afullformula)
630630
631-
Here the (possibly huge) $m \times m $ matrix $X^{+}$ is the Moore-Penrose generalized inverse of $X$.
631+
Here the (possibly huge) $\tilde n \times m $ matrix $X^{+}$ is the Moore-Penrose generalized inverse of $X$.
632632
633633
The $i$th row of $A$ is a $1 \times m$ vector of regression coefficients of $X_{i,t+1}$ on $X_{j,t}, j = 1, \ldots, m$.
634634
@@ -641,7 +641,7 @@ Consider the (reduced) singular value decomposition
641641
642642
643643
644-
where $U$ is $m \times p$, $\Sigma$ is a $p \times p$ diagonal matrix, and $ V^T$ is a $p \times m$ matrix.
644+
where $U$ is $m \times p$, $\Sigma$ is a $p \times p$ diagonal matrix, and $ V^T$ is a $p \times \tilde n$ matrix.
645645
646646
Here $p$ is the rank of $X$, where necessarily $p \leq \tilde n$.
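As a rough numerical illustration of the dimensions stated above, the following sketch builds a small random data set, computes a reduced SVD with NumPy, and forms $X^{+} = V \Sigma^{-1} U^T$ and $A = X' X^{+}$. The sizes `m = 3` and `n_tilde = 8`, the random seed, the rank tolerance, and all variable names are assumptions made only for this example; they do not come from the lecture.

```python
import numpy as np

# Hypothetical data: m variables observed over n_tilde + 1 periods
rng = np.random.default_rng(0)
m, n_tilde = 3, 8
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]   # X holds X_1..X_n, X' holds X_2..X_{n+1}

# Reduced SVD: keep only the p strictly positive singular values
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
p = int(np.sum(sigma > 1e-10))           # numerical rank (tolerance is an assumption)
U, sigma, Vt = U[:, :p], sigma[:p], Vt[:p, :]

# Moore-Penrose inverse from the SVD, and the least-squares transition matrix
X_pinv = Vt.T @ np.diag(1 / sigma) @ U.T  # n_tilde x m
A = X_prime @ X_pinv                      # m x m
```

Comparing `X_pinv` with `np.linalg.pinv(X)` is one way to check that the SVD-based formula agrees with NumPy's built-in pseudo-inverse.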
647647
@@ -673,32 +673,45 @@ The idea behind **dynamic mode decomposition** is to construct this low rank ap
673673
674674
675675
676-
## Preliminary Analysis
676+
## Analysis
677677
678678
We'll put basic ideas on the table by starting with the special case in which $r = p$ so that we retain
679679
all $p$ singular values of $X$.
680680
681681
(Later, we'll retain only $r < p$ of them)
682682
683683
When $r = p$, formula
684-
{eq}`eq:Xpinverse` implies that
684+
{eq}`eq:Xpinverse` for $X^+$ implies that
685685
686686
687687
$$
688688
A = X' V \Sigma^{-1} U^T
689689
$$ (eq:Aformbig)
690690
691-
where $V$ is an $\tilde n \times p$ matrix, $\Sigma^{-1}$ is a $p \times p$ matrix, $U$ is a $p \times m$ matrix,
691+
where $V$ is an $\tilde n \times p$ matrix, $\Sigma^{-1}$ is a $p \times p$ matrix, $U^T$ is a $p \times m$ matrix,
692692
and $U^T U = I_p$ and $V^T V = I_p$.
693693
694+
695+
It is convenient to represent $A$ as computed in equation {eq}`eq:Aformbig` as
696+
697+
$$
698+
A = U \tilde A U^T
699+
$$ (eq:Afactortilde)
700+
701+
where the $p \times p$ transition matrix $\tilde A$ can be recovered from
702+
703+
$$
704+
\tilde A = U^T A U = U^T X' V \Sigma^{-1} .
705+
$$ (eq:Atilde0)
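The second equality in {eq}`eq:Atilde0` can be checked in one line by substituting {eq}`eq:Aformbig` for $A$ and using $U^T U = I_p$. The worked step below is added here only as a reading aid:

$$
U^T A U = U^T \left( X' V \Sigma^{-1} U^T \right) U = U^T X' V \Sigma^{-1} \left( U^T U \right) = U^T X' V \Sigma^{-1} .
$$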
706+
694707
We use the $p$ columns of $U$, and thus the $p$ rows of $U^T$, to define a $p \times 1$ vector $\tilde X_t$ as follows
695708
696709
697710
$$
698711
\tilde X_t = U^T X_t .
699712
$$ (eq:tildeXdef2)
700713
701-
Since $U U^T$ is an $m \times m$ identity matrix, it follows from equation {eq}`eq:tildeXdef2` that we can recover $X_t$ from $\tilde X_t$ by using
714+
Since $U U^T$ is an $m \times m$ identity matrix, it follows from equation {eq}`eq:tildeXdef2` that we can reconstruct $X_t$ from $\tilde X_t$ by using
702715
703716
$$
704717
X_t = U \tilde X_t .
@@ -709,13 +722,9 @@ $$ (eq:Xdecoder)
709722
710723
* Equation {eq}`eq:Xdecoder` serves as a **decoder** that recovers the $m \times 1$ vector $X_t$ from the $p \times 1$ vector $\tilde X_t$
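The encoder and decoder roles can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses hypothetical random data with `m = 3` and `n_tilde = 8`, so that $X$ has full row rank and $p = m$, which is the case in which $U U^T$ is exactly the $m \times m$ identity. All variable names are invented for the example.

```python
import numpy as np

# Hypothetical short-fat data set, so that p = m and U is square orthogonal
rng = np.random.default_rng(1)
m, n_tilde = 3, 8
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
Sigma_inv = np.diag(1 / sigma)

A = X_prime @ Vt.T @ Sigma_inv @ U.T          # m x m
A_tilde = U.T @ X_prime @ Vt.T @ Sigma_inv    # p x p

x_t = X[:, 0]
x_tilde = U.T @ x_t      # encoder: m x 1  ->  p x 1
x_back = U @ x_tilde     # decoder: p x 1  ->  m x 1

# One step of the fitted dynamics, computed in both coordinate systems
step_encoded = U.T @ (A @ x_t)       # apply A, then encode
step_reduced = A_tilde @ x_tilde     # encode, then apply A_tilde
```

Under the assumption $p = m$, `x_back` reproduces `x_t`, the two one-step results coincide, and `U @ A_tilde @ U.T` reproduces `A`.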
711724
712-
The following $p \times p$ transition matrix governs the motion of $\tilde X_t$:
713725
714-
$$
715-
\tilde A = U^T A U = U^T X' V \Sigma^{-1} .
716-
$$ (eq:Atilde0)
717726
718-
Evidently,
727+
Because $U^T U = I_p$, we have
719728
720729
$$
721730
\tilde X_{t+1} = \tilde A \tilde X_t
@@ -725,7 +734,7 @@ Notice that if we multiply both sides of {eq}`eq:xtildemotion` by $U$
725734
we get
726735
727736
$$
728-
U \tilde X_t = U \tilde A \tilde X_t = U \tilde A U^T X_t
737+
U \tilde X_{t+1} = U \tilde A \tilde X_t = U \tilde A U^T X_t
729738
$$
730739
731740
which by virtue of decoder equation {eq}`eq:Xdecoder` recovers
@@ -738,53 +747,9 @@ $$
738747
739748
740749
741-
### Lower Rank Approximations
742-
743-
744-
Instead of using all $p$ modes $\tilde X_t$ calculated according to formula {eq}`eq:tildeXdef2`, we can use just the $r<p$ largest of them.
745-
746-
These are the ones that are most important in shaping
747-
the dynamics of $X$.
748-
749-
We can accomplish this by computing the $r$ largest singular values of $X$ and forming matrices $\tilde V, \tilde U$ corresponding to those $r$ singular values.
750-
751-
We can then construct a reduced-order system of dimension $r$ by forming an $r \times r$ transition matrix
752-
$\tilde A$ redefined by
753-
754-
$$
755-
\tilde A = \tilde U^T A \tilde U
756-
$$ (eq:tildeA_1)
757-
758-
Here we now use $\tilde U$ rather than $U$ as we did earlier in equation {eq}`eq:Atilde0`.
759-
760-
This redefined $\tilde A$ matrix governs the dynamics of a redefined $r \times 1$ vector $\tilde X_t $
761-
according to
762-
763-
$$
764-
\tilde X_{t+1} = \tilde A \tilde X_t
765-
$$
766-
767-
where now
768-
769-
$$
770-
\tilde X_t = \tilde U^T X_t
771-
$$
772-
773-
and
774-
775-
$$
776-
X_t = \tilde U \tilde X_t.
777-
$$
778-
779-
From equation {eq}`eq:tildeA_1` and {eq}`eq:Aformbig` it follows that
780-
781-
782-
$$
783-
\tilde A = \tilde U^T X' \tilde V \Sigma^{-1}
784-
$$ (eq:tildeAform)
785-
786750
787-
Next, we'll construct an eigencomposition of $\tilde A$:
751+
It is useful to construct an eigendecomposition of the $p \times p$ transition matrix $\tilde A$ defined
752+
in equation {eq}`eq:Atilde0` above:
788753
789754
$$
790755
\tilde A W = W \Lambda
@@ -793,40 +758,40 @@ $$ (eq:tildeAeigen)
793758
where $\Lambda$ is a $p \times p$ diagonal matrix of eigenvalues and the columns of $W$ are corresponding eigenvectors
794759
of $\tilde A$.
795760
796-
Both $\Lambda$ and $W$ are $r \times r$ matrices.
761+
Both $\Lambda$ and $W$ are $p \times p$ matrices.
797762
798-
Construct the $m \times r$ matrix
763+
Construct the $m \times p$ matrix
799764
800765
$$
801-
\Phi = X' \tilde V \tilde \Sigma^{-1} W
766+
\Phi = X' V \Sigma^{-1} W
802767
$$ (eq:Phiformula)
803768
804769
805770
806-
The following very useful proposition was established by Tu et al. {cite}`tu_Rowley`.
771+
Tu et al. {cite}`tu_Rowley` established the following
807772
808773
**Proposition** The $p$ columns of $\Phi$ are eigenvectors of $A$ that correspond to the largest $p$ eigenvalues of $A$.
809774
810775
**Proof:** From formula {eq}`eq:Phiformula` we have
811776
812777
$$
813778
\begin{aligned}
814-
A \Phi & = (X' \tilde V \tilde \Sigma^{-1} \tilde U^T) (X' \tilde V \tilde \Sigma^{-1} W) \cr
815-
& = X' \tilde V \Sigma^{-1} \tilde A W \cr
816-
& = X' \tilde V \tilde \Sigma^{-1} W \Lambda \cr
779+
A \Phi & = (X' V \Sigma^{-1} U^T) (X' V \Sigma^{-1} W) \cr
780+
& = X' V \Sigma^{-1} \tilde A W \cr
781+
& = X' V \Sigma^{-1} W \Lambda \cr
817782
& = \Phi \Lambda
818783
\end{aligned}
819784
$$
820785
821-
Thus, we can conclude that
786+
Thus, we have deduced that
822787
823788
$$
824789
A \Phi = \Phi \Lambda
825790
$$ (eq:APhiLambda)
826791
827792
Let $\phi_i$ be the $i$th column of $\Phi$ and $\lambda_i$ be the corresponding $i$th eigenvalue of $\tilde A$ from decomposition {eq}`eq:tildeAeigen`.
828793
829-
Writing out the $m \times r$ vectors on both sides of equation {eq}`eq:APhiLambda` and equating them gives
794+
Writing out the $m \times 1$ column vectors on both sides of equation {eq}`eq:APhiLambda` and equating them gives
830795
831796
832797
$$
@@ -841,72 +806,115 @@ This concludes the proof.
841806
Also see {cite}`DDSE_book` (p. 238)
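A quick numerical check of equation {eq}`eq:APhiLambda` is sketched below. The random data, the sizes, and the variable names are assumptions made for the illustration. Because $\tilde A$ is generally not symmetric, `np.linalg.eig` can return complex eigenvalues and eigenvectors, so `Lam`, `W`, and `Phi` may be complex arrays.

```python
import numpy as np

# Hypothetical data, as in a toy example (not from the lecture)
rng = np.random.default_rng(2)
m, n_tilde = 3, 8
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
Sigma_inv = np.diag(1 / sigma)

A = X_prime @ Vt.T @ Sigma_inv @ U.T
A_tilde = U.T @ X_prime @ Vt.T @ Sigma_inv

# Eigendecomposition  A_tilde W = W Lambda
Lam, W = np.linalg.eig(A_tilde)

# Phi = X' V Sigma^{-1} W
Phi = X_prime @ Vt.T @ Sigma_inv @ W

# Largest absolute entry of A Phi - Phi Lambda; should be near machine precision
residual = np.max(np.abs(A @ Phi - Phi @ np.diag(Lam)))
```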
842807
843808
809+
### Two Representations of $A$
810+
811+
We have constructed two representations of (or approximations to) $A$.
812+
813+
One from equation {eq}`eq:Afactortilde` is
814+
815+
$$
816+
A = U \tilde A U^T
817+
$$ (eq:Aform11)
818+
819+
while the other, from the eigendecomposition {eq}`eq:APhiLambda`, is
820+
821+
$$
822+
A = \Phi \Lambda \Phi^+
823+
$$ (eq:Aform12)
824+
825+
826+
From formula {eq}`eq:Aform11` we can deduce
827+
828+
$$
829+
\tilde X_{t+1} = \tilde A \tilde X_t
830+
$$
831+
832+
where
833+
834+
$$
835+
\begin{aligned}
836+
\tilde X_t & = U^T X_t \cr
837+
X_t & = U \tilde X_t
838+
\end{aligned}
839+
$$
840+
844841
842+
From formula {eq}`eq:Aform12` we can deduce
845843
844+
$$
845+
b_{t+1} = \Lambda b_t
846+
$$
847+
848+
where
846849
850+
$$
851+
\begin{aligned}
852+
b_t & = \Phi^+ X_t \cr
853+
X_t & = \Phi b_t
854+
\end{aligned}
855+
$$
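The step from {eq}`eq:Aform12` to the law of motion for $b_t$ can be spelled out as follows. This is a sketch that assumes the fitted dynamics $X_{t+1} = A X_t$ and that $\Phi$ has full column rank, so that $\Phi^{+} \Phi = I_p$. Premultiplying $X_{t+1} = \Phi \Lambda \Phi^{+} X_t$ by $\Phi^{+}$ gives

$$
\Phi^{+} X_{t+1} = \left( \Phi^{+} \Phi \right) \Lambda \Phi^{+} X_t = \Lambda \Phi^{+} X_t ,
$$

which is $b_{t+1} = \Lambda b_t$ with $b_t = \Phi^{+} X_t$.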
847856
848857
849-
## Some Refinements
858+
There is a better formula for the $p \times 1$ vector $b_t$.
850859
851-
The following argument from {cite}`DDSE_book` (page 240) provides a computationally efficient way
852-
to compute projections of the time $t$ data onto $r$ dominant **modes** at time $t$.
860+
In particular, the following argument from {cite}`DDSE_book` (page 240) provides a computationally efficient way
861+
to compute $b_t$.
853862
854863
For convenience, we'll do this first for time $t=1$.
855864
856865
857866
858-
Define a projection of $X_1$ onto $r$ dominant **modes** $b$ at time $1$ by
867+
For $t=1$, we have
859868
860869
$$
861-
X_1 = \Phi b
870+
X_1 = \Phi b_1
862871
$$ (eq:X1proj)
863872
864-
where $b$ is an $r \times 1$ vector.
873+
where $b_1$ is a $p \times 1$ vector.
865874
866-
Since $X_1 = \tilde U \tilde X_1$, it follows that
875+
Since $X_1 = U \tilde X_1$, it follows that
867876
868877
$$
869-
\tilde U \tilde X_1 = X' \tilde V \tilde \Sigma^{-1} W b
878+
U \tilde X_1 = X' V \Sigma^{-1} W b_1
870879
$$
871880
872881
and
873882
874883
$$
875-
\tilde X_1 = \tilde U^T X' \tilde V \tilde \Sigma^{-1} W b
884+
\tilde X_1 = U^T X' V \Sigma^{-1} W b_1
876885
$$
877886
878-
Recall from formula {eq}`eq:tildeAform` that $ \tilde A = \tilde U^T X' \tilde V \tilde \Sigma^{-1}$ so that
887+
Recall that $ \tilde A = U^T X' V \Sigma^{-1}$ so that
879888
880889
$$
881-
\tilde X_1 = \tilde A W b
890+
\tilde X_1 = \tilde A W b_1
882891
$$
883892
884893
and therefore, by the eigendecomposition {eq}`eq:tildeAeigen` of $\tilde A$, we have
885894
886895
$$
887-
\tilde X_1 = W \Lambda b
896+
\tilde X_1 = W \Lambda b_1
888897
$$
889898
890899
Therefore,
891900
892901
$$
893-
b = ( W \Lambda)^{-1} \tilde X_1
902+
b_1 = ( W \Lambda)^{-1} \tilde X_1
894903
$$
895904
896905
or
897906
898907
899908
$$
900-
b = ( W \Lambda)^{-1} \tilde U^T X_1
909+
b_1 = ( W \Lambda)^{-1} U^T X_1
901910
$$ (eq:beqnsmall)
902911
903912
904913
905-
which is computationally more efficient than the following alternative equation for computing the initial vector $b$ of $r$ dominant
906-
modes:
914+
which is computationally more efficient than the following instance of our earlier equation for computing the initial vector $b_1$:
907915
908916
$$
909-
b= \Phi^{+} X_1
917+
b_1= \Phi^{+} X_1
910918
$$ (eq:bphieqn)
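The two ways of computing $b_1$ can be compared numerically. The sketch below rests on assumptions introduced for the example: random data with `m = 3` and `n_tilde = 8`, so that $p = m$ and $X_1 = U \tilde X_1$ holds exactly, and variable names of my own choosing. It solves the small $p \times p$ system from {eq}`eq:beqnsmall` instead of forming the pseudo-inverse of $\Phi$.

```python
import numpy as np

# Hypothetical short-fat data set, so that p = m
rng = np.random.default_rng(3)
m, n_tilde = 3, 8
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
Sigma_inv = np.diag(1 / sigma)
A_tilde = U.T @ X_prime @ Vt.T @ Sigma_inv
Lam, W = np.linalg.eig(A_tilde)
Phi = X_prime @ Vt.T @ Sigma_inv @ W

X_1 = X[:, 0]

# b_1 = (W Lambda)^{-1} U^T X_1, computed by solving a p x p linear system
b_small = np.linalg.solve(W @ np.diag(Lam), U.T @ X_1)

# b_1 = Phi^+ X_1, computed with an m x p pseudo-inverse
b_pinv = np.linalg.pinv(Phi) @ X_1
```

In this toy setting the two vectors agree up to rounding error, and `Phi @ b_small` reproduces `X_1`. The computational advantage of the first formula only shows up when $m$ is much larger than $p$.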
911919
912920
@@ -921,33 +929,25 @@ $$ (eq:checkXevoln)
921929
or the following equation
922930
923931
$$
924-
\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} \tilde U^T X_t
932+
\check X_{t+j} = \Phi \Lambda^j (W \Lambda)^{-1} U^T X_t
925933
$$ (eq:checkXevoln2)
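Equation {eq}`eq:checkXevoln2` translates into a short forecasting routine. The function name `forecast`, the random data, and the sizes below are hypothetical and chosen only for illustration; the data are again set up so that $p = m$. Taking the real part at the end discards imaginary components that, for real data, should be rounding noise.

```python
import numpy as np

# Hypothetical short-fat data set, so that p = m
rng = np.random.default_rng(4)
m, n_tilde = 3, 8
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
Sigma_inv = np.diag(1 / sigma)
A = X_prime @ Vt.T @ Sigma_inv @ U.T
A_tilde = U.T @ X_prime @ Vt.T @ Sigma_inv
Lam, W = np.linalg.eig(A_tilde)
Phi = X_prime @ Vt.T @ Sigma_inv @ W


def forecast(x_t, j):
    """Return Phi Lambda^j (W Lambda)^{-1} U^T x_t for an integer horizon j."""
    b = np.linalg.solve(W @ np.diag(Lam), U.T @ x_t)
    # Lambda is diagonal, so Lambda^j b is an elementwise product
    return (Phi @ (Lam**j * b)).real


x_t = X[:, -1]
x_check_3 = forecast(x_t, 3)   # three-step-ahead forecast
```

With $p = m$, the forecast coincides with iterating the fitted matrix, that is, with `np.linalg.matrix_power(A, j) @ x_t`.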
926934
927935
928936
929-
### Putting Things Together
930-
931-
With $\Lambda, \Phi, \Phi^{+}$ in hand, our least-squares fitted dynamics fitted to the $r$ modes
932-
are governed by
937+
### Using Fewer Modes
933938
934-
$$
935-
X_{t+1}^{(r)} = \Phi \Lambda \Phi^{+} X_t^{(r)}
936-
$$ (eq:Xdynamicsapprox)
939+
The preceding formulas assume that we have retained all $p$ modes associated with the positive
940+
singular values of $X$.
937941
938-
where $X_t^{(r)}$ is an $m \times 1$ vector.
942+
We can easily adapt all of the formulas to describe a situation in which we instead retain only
943+
the $r < p$ largest singular values.
939944
940-
By virtue of equation {eq}`eq:APhiLambda`, it follows that **if we had kept $r = p$**, this equation would be equivalent with
945+
In that case, we simply replace $\Sigma$ with the appropriate $r \times r$ matrix of singular values,
946+
$U$ with the $m \times r$ matrix whose columns correspond to the $r$ largest singular values,
947+
and $V$ with the $\tilde n \times r$ matrix whose columns correspond to the $r$ largest singular values.
941948
942-
$$
943-
X_{t+1} = A X_t .
944-
$$ (eq:Xdynamicstrue)
949+
Counterparts of all of the salient formulas above then apply.
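A minimal sketch of the reduced-rank case follows. The data, the choice `r = 2`, and names such as `U_r`, `A_hat`, and `Phi_r` are assumptions made for the example. It relies on `np.linalg.svd` returning singular values in descending order, so the first `r` columns correspond to the `r` largest singular values.

```python
import numpy as np

# Hypothetical data and truncation level
rng = np.random.default_rng(5)
m, n_tilde, r = 5, 12, 2
data = rng.standard_normal((m, n_tilde + 1))
X, X_prime = data[:, :-1], data[:, 1:]

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the r largest singular values and the matching columns
U_r = U[:, :r]                       # m x r
Sigma_r_inv = np.diag(1 / sigma[:r])  # r x r
V_r = Vt[:r, :].T                    # n_tilde x r

A_hat = X_prime @ V_r @ Sigma_r_inv @ U_r.T        # rank-r approximation, m x m
A_tilde_r = U_r.T @ X_prime @ V_r @ Sigma_r_inv    # r x r transition matrix
Lam_r, W_r = np.linalg.eig(A_tilde_r)
Phi_r = X_prime @ V_r @ Sigma_r_inv @ W_r          # m x r matrix of modes
```

The counterpart of {eq}`eq:APhiLambda` can then be checked by comparing `A_hat @ Phi_r` with `Phi_r @ np.diag(Lam_r)`.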
945950
946-
When $r < p $, equation {eq}`eq:Xdynamicsapprox` is an approximation (of reduced order $r$) to the $X$ dynamics in equation
947-
{eq}`eq:Xdynamicstrue`.
948-
949-
950-
Conditional on $X_t$, we construct forecasts $\check X_{t+j} $ of $X_{t+j}, j = 1, 2, \ldots, $ from {eq}`eq:checkXevoln`.
951951
952952
953953
