
Commit 2ca3637

Tom's March 26 edits of svd lecture
1 parent 586bd36

File tree

2 files changed: +89 −259 lines changed


lectures/SVD_reform.md

Lines changed: 0 additions & 254 deletions
This file was deleted.

lectures/svd_intro.md

Lines changed: 89 additions & 5 deletions
@@ -595,10 +595,12 @@ $$ (eq:Xvector)

 and where $ T $ again denotes complex transposition and $ X_{i,t} $ is an observation on variable $ i $ at time $ t $.

-We want to fit equation {eq}`eq:VARfirstorder` in a situation in which we have a number $n$ of observations that is small relative to the number $m$ of
-variables that appear in the vector $X_t$.

-In particular, our data takes the form of an $ m \times n $ matrix o $ \tilde X $
+
+We want to fit equation {eq}`eq:VARfirstorder`.
+
+
+Our data is assembled in the form of an $ m \times n $ matrix $ \tilde X $

 $$
 \tilde X = \begin{bmatrix} X_1 \mid X_2 \mid \cdots \mid X_n\end{bmatrix}
@@ -609,6 +611,8 @@ where for $ t = 1, \ldots, n $, the $ m \times 1 $ vector $ X_t $ is given by {
 We want to estimate system {eq}`eq:VARfirstorder` consisting of $ m $ least squares regressions of **everything** on one lagged value of **everything**.


+
+
 We proceed as follows.

@@ -630,9 +634,73 @@ In forming $ X $ and $ X' $, we have in each case dropped a column from $ \tild

 Evidently, $ X $ and $ X' $ are both $ m \times \tilde n $ matrices where $ \tilde n = n - 1 $.

-We denote the rank of $ X $ as $ p \leq \min(m, \tilde n) = \tilde n $.
+We denote the rank of $ X $ as $ p \leq \min(m, \tilde n) $.
+
+Two possible cases are when
+
+* $ \tilde n \gg m $, so that we have many more time series observations $\tilde n$ than variables $m$
+* $ m \gg \tilde n $, so that we have many more variables $m$ than time series observations $\tilde n$
+
+At a general level that includes both of these special cases, a common formula describes the least squares estimator $\hat A$ of $A$, but important details differ.
+
+The common formula is
+
+$$ \hat A = X' X^+ $$
+
+where $X^+$ is the pseudo-inverse of $X$.
+
+Formulas for the pseudo-inverse differ for our two cases.
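Both cases can be computed with NumPy's SVD-based pseudo-inverse. Here is a minimal sketch of the common formula on simulated data; the names `X`, `Xprime`, and `A_hat` are illustrative assumptions, not from the lecture:

```python
import numpy as np

# Common formula A_hat = X' X^+, with X^+ computed via numpy's
# SVD-based pseudo-inverse; X and Xprime stand in for X and X'.
rng = np.random.default_rng(0)
m, n_tilde = 6, 20
X = rng.standard_normal((m, n_tilde))
Xprime = rng.standard_normal((m, n_tilde))

A_hat = Xprime @ np.linalg.pinv(X)   # an m x m matrix
print(A_hat.shape)                   # (6, 6)
```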

-As our estimator $\hat A$ of $A$ we form an $m \times m$ matrix that solves the least-squares best-fit problem
+When $ \tilde n \gg m $, so that we have many more time series observations $\tilde n$ than variables $m$, and when
+$X$ has linearly independent **rows**, $X X^T$ has an inverse and the pseudo-inverse $X^+$ is
+
+$$
+X^+ = X^T (X X^T)^{-1}
+$$
+
+Here $X^+$ is a **right-inverse** that satisfies $ X X^+ = I_{m \times m}$.
+
+In this case, our formula for the least-squares estimator of $A$ becomes
+
+$$
+\hat A = X' X^T (X X^T)^{-1}
+$$
+
+This formula is widely used in economics to estimate vector autoregressions.
+
+The right side is proportional to the empirical cross second moment matrix of $X_{t+1}$ and $X_t$ times the inverse
+of the empirical second moment matrix of $X_t$, the standard least-squares formula of econometrics.
+
+
+
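A small simulation sketch of this right-inverse case; the data-generating process and the names `A_true`, `X`, `Xprime` are assumptions for illustration, not from the lecture:

```python
import numpy as np

# Tall-data case (n_tilde >> m): simulate a stable VAR(1) and estimate A.
rng = np.random.default_rng(0)
m, n = 3, 500
A_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.1],
                   [0.1, 0.0, 0.7]])

X_full = np.empty((m, n))
X_full[:, 0] = rng.standard_normal(m)
for t in range(1, n):
    X_full[:, t] = A_true @ X_full[:, t - 1] + 0.1 * rng.standard_normal(m)

X, Xprime = X_full[:, :-1], X_full[:, 1:]   # drop last / first column

X_plus = X.T @ np.linalg.inv(X @ X.T)       # right-inverse X^+
print(np.allclose(X @ X_plus, np.eye(m)))   # True: X X^+ = I
A_hat = Xprime @ X_plus                     # least-squares estimator
print(np.round(A_hat, 2))                   # should be close to A_true
```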
+When $m \gg \tilde n$, so that we have many more variables $m$ than time series observations $\tilde n$, and when $X$ has linearly independent **columns**,
+$X^T X$ has an inverse and the pseudo-inverse $X^+$ is
+
+$$
+X^+ = (X^T X)^{-1} X^T
+$$
+
+Here $X^+$ is a **left-inverse** that satisfies $X^+ X = I_{\tilde n \times \tilde n}$.
+
+In this case, our formula for a least-squares estimator of $A$ becomes
+
+$$
+\hat A = X' (X^T X)^{-1} X^T
+$$ (eq:hatAversion0)
+
+This is the case that we are interested in here.
+
+
+Thus, we want to fit equation {eq}`eq:VARfirstorder` in a situation in which we have a number $n$ of observations that is small relative to the number $m$ of
+variables that appear in the vector $X_t$.
+
+We'll use efficient algorithms for computing $\hat A$ in formula {eq}`eq:hatAversion0` and for constructing reduced rank approximations of it.
+
+
+
+
+
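A corresponding sketch for this left-inverse case, again with simulated data and illustrative names:

```python
import numpy as np

# Wide-data case (m >> n_tilde): X has full column rank almost surely.
rng = np.random.default_rng(1)
m, n_tilde = 50, 8
X = rng.standard_normal((m, n_tilde))
Xprime = rng.standard_normal((m, n_tilde))

# Left-inverse formula: X^+ = (X^T X)^{-1} X^T
X_plus = np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(X_plus @ X, np.eye(n_tilde)))   # True: X^+ X = I

# It agrees with the SVD-based pseudo-inverse that numpy computes
print(np.allclose(X_plus, np.linalg.pinv(X)))     # True

A_hat = Xprime @ X_plus                           # eq:hatAversion0
```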
+To reiterate and to supply more detail about how we can efficiently calculate the pseudo-inverse $X^+$: as our estimator $\hat A$ of $A$ we form an $m \times m$ matrix that solves the least-squares best-fit problem

 $$
 \hat A = \textrm{argmin}_{\check A} || X' - \check A X ||_F
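This Frobenius-norm problem can also be handed to a standard least-squares solver. A sketch, assuming simulated `X` and `Xprime`, showing that `np.linalg.lstsq` recovers the same $\hat A$ as the pseudo-inverse formula:

```python
import numpy as np

# argmin_A || X' - A X ||_F equals argmin over A^T of || X^T A^T - X'^T ||_F,
# which lstsq solves column by column.
rng = np.random.default_rng(5)
m, n_tilde = 4, 30
X = rng.standard_normal((m, n_tilde))
Xprime = rng.standard_normal((m, n_tilde))

A_hat_T, *_ = np.linalg.lstsq(X.T, Xprime.T, rcond=None)
A_hat = A_hat_T.T

print(np.allclose(A_hat, Xprime @ np.linalg.pinv(X)))   # True: same solution
```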
@@ -706,6 +774,8 @@ $$
 X_t = U \tilde b_t
 $$ (eq:Xdecoder)

+(Here we use $b$ to remind us that we are creating a **basis** vector.)
+
 Since $U U^T$ is an $m \times m$ identity matrix, it follows from equation {eq}`eq:tildeXdef2` that we can reconstruct $X_t$ from $\tilde b_t$ by using

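A small sketch of this encode/decode round trip with a full SVD, on simulated data with illustrative names:

```python
import numpy as np

# With a full SVD, U is m x m and orthogonal, so U @ U.T = I.
rng = np.random.default_rng(2)
m, n_tilde = 5, 12
X = rng.standard_normal((m, n_tilde))

U, S, Vt = np.linalg.svd(X, full_matrices=True)

b_tilde = U.T @ X          # encode: basis coefficients for each X_t
X_rec = U @ b_tilde        # decode: X_t = U b_tilde_t
print(np.allclose(X_rec, X))              # True
print(np.allclose(U @ U.T, np.eye(m)))    # True
```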
@@ -801,6 +871,20 @@



+## Using Fewer Modes
+
+The preceding formulas assume that we have retained all $p$ modes associated with the positive
+singular values of $X$.
+
+We can easily adapt all of the formulas to describe a situation in which we instead retain only
+the $r < p$ largest singular values.
+
+In that case, we simply replace $\Sigma$ with the appropriate $r \times r$ matrix of singular values,
+$U$ with the $m \times r$ matrix whose columns correspond to the $r$ largest singular values,
+and $V$ with the $\tilde n \times r$ matrix whose columns correspond to the $r$ largest singular values.
+
+Counterparts of all of the salient formulas above then apply.
+
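A sketch of the truncation just described, keeping only the `r` largest singular values (simulated data; `r` and the names are illustrative):

```python
import numpy as np

# Keep only the r largest singular values/modes of X.
rng = np.random.default_rng(3)
m, n_tilde, r = 40, 10, 3
X = rng.standard_normal((m, n_tilde))

U, S, Vt = np.linalg.svd(X, full_matrices=False)
U_r = U[:, :r]               # m x r
Sigma_r = np.diag(S[:r])     # r x r
V_r = Vt[:r, :].T            # n_tilde x r

X_r = U_r @ Sigma_r @ V_r.T             # best rank-r approximation of X
print(np.linalg.matrix_rank(X_r))       # r
```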


 ## Source for Some Python Code
