Commit 8da0125
Merge pull request #202 from QuantEcon/improve-mc-eigen
Improve Markov Chain and Eigenvalues Lectures
2 parents 5b6a6ff + 59ab02f

4 files changed: +59 -72 lines changed


lectures/_toc.yml

Lines changed: 1 addition & 1 deletion
@@ -35,6 +35,7 @@ parts:
   - file: olg
   - file: markov_chains_I
   - file: markov_chains_II
+  - file: eigen_II
   - file: commod_price
 - caption: Optimization
   numbered: true
@@ -46,7 +47,6 @@ parts:
 - caption: Modeling in Higher Dimensions
   numbered: true
   chapters:
-  - file: eigen_II
   - file: input_output
   - file: lake_model
   - file: asset_pricing

lectures/eigen_II.md

Lines changed: 19 additions & 41 deletions
@@ -11,8 +11,6 @@ kernelspec:
   name: python3
 ---
 
-
-
 # Spectral Theory
 
 ```{index} single: Spectral Theory
@@ -63,11 +61,11 @@ We denote this as $A \geq 0$.
 (irreducible)=
 ### Irreducible matrices
 
-We have (informally) introduced irreducible matrices in the [Markov chain lecture](markov_chains_II.md).
+We introduced irreducible matrices in the [Markov chain lecture](mc_irreducible).
 
-Here we will introduce this concept formally.
+Here we generalize this concept:
 
-$A$ is called **irreducible** if for *each* $(i,j)$ there is an integer $k \geq 0$ such that $a^{k}_{ij} > 0$.
+An $n \times n$ matrix $A$ is called irreducible if, for each $i,j$ with $1 \leq i, j \leq n$, there exists a $k \geq 0$ such that $a^{k}_{ij} > 0$.
 
 A matrix $A$ that is not irreducible is called reducible.
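(Editor's aside, not part of this commit: the new definition can be checked numerically. A minimal NumPy sketch, using the standard fact that a nonnegative $n \times n$ matrix $A$ is irreducible exactly when every entry of $\sum_{k=0}^{n-1} A^k$ is positive; the test matrix is illustrative.)

```python
import numpy as np

def is_irreducible(A):
    """Check irreducibility of a nonnegative square matrix A.

    A is irreducible iff every entry of I + A + ... + A^(n-1) is positive,
    i.e. every state j can be reached from every state i in at most n-1 steps.
    """
    n = A.shape[0]
    S = sum(np.linalg.matrix_power(A, k) for k in range(n))
    return bool(np.all(S > 0))

A = np.array([[0, 1],
              [1, 0]])    # illustrative: each state reaches the other, so irreducible
print(is_irreducible(A))  # True
```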
 
@@ -84,29 +82,25 @@ Here are some examples to illustrate this further.
 
 Let $A$ be a square nonnegative matrix and let $A^k$ be the $k^{th}$ power of $A$.
 
-A matrix is considered **primitive** if there exists a $k \in \mathbb{N}$ such that $A^k$ is everywhere positive.
+A matrix is called **primitive** if there exists a $k \in \mathbb{N}$ such that $A^k$ is everywhere positive.
 
 It means that $A$ is called primitive if there is an integer $k \geq 0$ such that $a^{k}_{ij} > 0$ for *all* $(i,j)$.
 
 We can see that if a matrix is primitive, then it implies the matrix is irreducible.
 
-This is because if there exists an $A^k$ such that $a^{k}_{ij} > 0$ for all $(i,j)$, then it guarantees the same property for ${k+1}^th, {k+2}^th ... {k+n}^th$ iterations.
-
-In other words, a primitive matrix is both irreducible and aperiodic as aperiodicity requires a state to be visited with a guarantee of returning to itself after a certain amount of iterations.
-
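(Another editorial aside, not from the commit: primitivity can also be tested numerically. A minimal sketch relying on Wielandt's bound, which says a primitive $n \times n$ matrix already has $A^{(n-1)^2 + 1}$ everywhere positive; both test matrices are illustrative.)

```python
import numpy as np

def is_primitive(A):
    """Check primitivity of a nonnegative square matrix A.

    By Wielandt's bound, if A is primitive then A**((n-1)**2 + 1) is already
    everywhere positive, so inspecting that single power is a valid test.
    """
    n = A.shape[0]
    k = (n - 1)**2 + 1
    return bool(np.all(np.linalg.matrix_power(A, k) > 0))

A = np.array([[0, 1],
              [1, 0]])    # irreducible but periodic: its powers are never all positive
B = np.array([[0, 1],
              [1, 1]])    # B @ B is everywhere positive, so B is primitive
print(is_primitive(A), is_primitive(B))  # False True
```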
 ### Left eigenvectors
 
-We have previously discussed right (ordinary) eigenvectors $Av = \lambda v$.
+We previously discussed right (ordinary) eigenvectors $Av = \lambda v$.
 
 Here we introduce left eigenvectors.
 
 Left eigenvectors will play important roles in what follows, including that of stochastic steady states for dynamic models under a Markov assumption.
 
 We will talk more about this later, but for now, let's define left eigenvectors.
 
-A vector $w$ is called a left eigenvector of $A$ if $w$ is an eigenvector of $A^T$.
+A vector $w$ is called a left eigenvector of $A$ if $w$ is an eigenvector of $A^\top$.
 
-In other words, if $w$ is a left eigenvector of matrix A, then $A^T w = \lambda w$, where $\lambda$ is the eigenvalue associated with the left eigenvector $v$.
+In other words, if $w$ is a left eigenvector of matrix $A$, then $A^\top w = \lambda w$, where $\lambda$ is the eigenvalue associated with the left eigenvector $v$.
 
 This hints at how to compute left eigenvectors
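(A small sketch of that computation, not part of the diff; the matrix is illustrative. Since a left eigenvector of $A$ is just a right eigenvector of $A^\top$, one call to `eig` on the transpose is enough.)

```python
import numpy as np
from numpy.linalg import eig

A = np.array([[3, 2],
              [1, 4]])

λ, v = eig(A)      # right eigenvectors: A v = λ v
λ_T, w = eig(A.T)  # left eigenvectors: Aᵀ w = λ w, i.e. wᵀ A = λ wᵀ

print(λ)    # eigenvalues of A ...
print(λ_T)  # ... match those of Aᵀ (possibly in a different order)
print(w)    # columns of w are left eigenvectors of A
```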
 
@@ -147,17 +141,17 @@ print(w)
 
 Note that the eigenvalues for both left and right eigenvectors are the same, but the eigenvectors themselves are different.
 
-We can then take transpose to obtain $A^T w = \lambda w$ and obtain $w^T A= \lambda w^T$.
+We can then take transpose to obtain $A^\top w = \lambda w$ and obtain $w^\top A= \lambda w^\top$.
 
 This is a more common expression and where the name left eigenvectors originates.
 
 (perron-frobe)=
 ### The Perron-Frobenius Theorem
 
-For a nonnegative matrix $A$ the behavior of $A^k$ as $k \to \infty$ is controlled by the eigenvalue with the largest
+For a square nonnegative matrix $A$, the behavior of $A^k$ as $k \to \infty$ is controlled by the eigenvalue with the largest
 absolute value, often called the **dominant eigenvalue**.
 
-For a matrix nonnegative square matrix $A$, the Perron-Frobenius Theorem characterizes certain
+For any such matrix $A$, the Perron-Frobenius Theorem characterizes certain
 properties of the dominant eigenvalue and its corresponding eigenvector.
 
 ```{prf:Theorem} Perron-Frobenius Theorem
@@ -178,9 +172,7 @@ If $A$ is primitive then,
 
 6. the inequality $|\lambda| \leq r(A)$ is **strict** for all eigenvalues $\lambda$ of $A$ distinct from $r(A)$, and
 7. with $v$ and $w$ normalized so that the inner product of $w$ and $v = 1$, we have
-$ r(A)^{-m} A^m$ converges to $v w^{\top}$ when $m \rightarrow \infty$.
-\
-the matrix $v w^{\top}$ is called the **Perron projection** of $A$.
+$ r(A)^{-m} A^m$ converges to $v w^{\top}$ when $m \rightarrow \infty$. The matrix $v w^{\top}$ is called the **Perron projection** of $A$
 ```
 
 (This is a relatively simple version of the theorem --- for more details see
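(Claim 7 is easy to confirm numerically. A minimal editorial sketch, not part of this commit; the matrix, the power $m$, and the normalization step are illustrative.)

```python
import numpy as np
from numpy.linalg import eig, matrix_power

A = np.array([[0.4, 0.6],
              [0.8, 0.2]])          # everywhere positive, hence primitive

# dominant eigenvalue r(A) with right eigenvector v
eigvals, V = eig(A)
i = np.argmax(np.abs(eigvals))
r, v = eigvals[i].real, V[:, i].real

# left eigenvector w for the same eigenvalue, normalized so that <w, v> = 1
eigvals_T, W = eig(A.T)
j = np.argmax(np.abs(eigvals_T))
w = W[:, j].real
w = w / (w @ v)

m = 50
print(matrix_power(A, m) / r**m)    # ≈ the Perron projection v wᵀ
print(np.outer(v, w))
```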
@@ -194,7 +186,7 @@ Now let's consider examples for each case.
 
 #### Example 1: irreducible matrix
 
-Consider the following irreducible matrix A:
+Consider the following irreducible matrix $A$:
 
 ```{code-cell} ipython3
 A = np.array([[0, 1, 0],
@@ -208,7 +200,7 @@ We can compute the dominant eigenvalue and the corresponding eigenvector
 eig(A)
 ```
 
-Now we can go through our checklist to verify the claims of the Perron-Frobenius Theorem for the irreducible matrix A:
+Now we can go through our checklist to verify the claims of the Perron-Frobenius Theorem for the irreducible matrix $A$:
 
 1. The dominant eigenvalue is real-valued and non-negative.
 2. All other eigenvalues have absolute values less than or equal to the dominant eigenvalue.
@@ -218,7 +210,7 @@ Now we can go through our checklist to verify the claims of the Perron-Frobenius
 
 #### Example 2: primitive matrix
 
-Consider the following primitive matrix B:
+Consider the following primitive matrix $B$:
 
 ```{code-cell} ipython3
 B = np.array([[0, 1, 1],
@@ -228,27 +220,13 @@ B = np.array([[0, 1, 1],
 np.linalg.matrix_power(B, 2)
 ```
 
-We can compute the dominant eigenvalue and the corresponding eigenvector using the power iteration method as discussed {ref}`earlier<eig1_ex1>`:
-
-```{code-cell} ipython3
-num_iters = 20
-b = np.random.rand(B.shape[1])
-
-for i in range(num_iters):
-    b = B @ b
-    b = b / np.linalg.norm(b)
-
-dominant_eigenvalue = np.dot(B @ b, b) / np.dot(b, b)
-np.round(dominant_eigenvalue, 2)
-```
+We compute the dominant eigenvalue and the corresponding eigenvector
 
 ```{code-cell} ipython3
 eig(B)
 ```
 
-
-
-Now let's verify the claims of the Perron-Frobenius Theorem for the primitive matrix B:
+Now let's verify the claims of the Perron-Frobenius Theorem for the primitive matrix $B$:
 
 1. The dominant eigenvalue is real-valued and non-negative.
 2. All other eigenvalues have absolute values strictly less than the dominant eigenvalue.
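(The first two checklist items are quick to confirm in code. An editorial sketch, not from the commit, using an illustrative primitive matrix rather than the lecture's $B$.)

```python
import numpy as np
from numpy.linalg import eigvals

B = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])   # illustrative primitive matrix (B @ B is everywhere positive)

λ = eigvals(B)
r = max(λ, key=abs)          # dominant eigenvalue
print(np.isclose(r.imag, 0) and r.real >= 0)                     # item 1: real and non-negative
print(all(abs(x) < abs(r) for x in λ if not np.isclose(x, r)))   # item 2: strictly smaller in modulus
```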
@@ -307,8 +285,8 @@ A1 = np.array([[1, 2],
                [1, 4]])
 
 A2 = np.array([[0, 1, 1],
-              [1, 0, 1],
-              [1, 1, 0]])
+               [1, 0, 1],
+               [1, 1, 0]])
 
 A3 = np.array([[0.971, 0.029, 0.1, 1],
                [0.145, 0.778, 0.077, 0.59],
@@ -353,7 +331,7 @@ In fact we have already seen the theorem in action before in {ref}`the markov ch
 
 We are now prepared to bridge the languages spoken in the two lectures.
 
-A primitive matrix is both irreducible (or strongly connected in the language of graph) and aperiodic.
+A primitive matrix is both irreducible (or strongly connected in the language of {ref}`graph theory<strongly_connected>` and aperiodic.
 
 So Perron-Frobenius Theorem explains why both Imam and Temple matrix and Hamilton matrix converge to a stationary distribution, which is the Perron projection of the two matrices
lectures/markov_chains_II.md

Lines changed: 38 additions & 29 deletions
@@ -61,6 +61,7 @@ from matplotlib import cm
 import matplotlib as mpl
 ```
 
+(mc_irreducible)=
 ## Irreducibility
 
 
@@ -141,8 +142,6 @@ mc = qe.MarkovChain(P, ('poor', 'middle', 'rich'))
 mc.is_irreducible
 ```
 
-
-
 It might be clear to you already that irreducibility is going to be important
 in terms of long-run outcomes.
 
@@ -246,34 +245,44 @@ Therefore, we can see the sample path averages for each state (the fraction of
 time spent in each state) converges to the stationary distribution regardless of
 the starting state
 
+Let's denote the fraction of time spent in state $x$ at time $t$ in our sample path as $\hat p_t(x)$ where
+
+$$
+\hat p_t(x) := \frac{1}{t} \sum_{t = 1}^t \mathbf{1}\{X_t = x\}
+$$
+
+
+Here we compare $\hat p_t(x)$ with the stationary distribution $\psi^* (x)$ for different starting points $x_0$.
+
 ```{code-cell} ipython3
 P = np.array([[0.971, 0.029, 0.000],
               [0.145, 0.778, 0.077],
               [0.000, 0.508, 0.492]])
 ts_length = 10_000
 mc = qe.MarkovChain(P)
 n = len(P)
-fig, axes = plt.subplots(nrows=1, ncols=n)
+fig, axes = plt.subplots(nrows=1, ncols=n, figsize=(15, 6))
 ψ_star = mc.stationary_distributions[0]
 plt.subplots_adjust(wspace=0.35)
 
 for i in range(n):
-    axes[i].grid()
-    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color = 'black',
+    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                     label = fr'$\psi^*({i})$')
     axes[i].set_xlabel('t')
-    axes[i].set_ylabel(f'fraction of time spent at {i}')
+    axes[i].set_ylabel(fr'$\hat p_t({i})$')
 
     # Compute the fraction of time spent, starting from different x_0s
     for x0, col in ((0, 'blue'), (1, 'green'), (2, 'red')):
         # Generate time series that starts at different x0
         X = mc.simulate(ts_length, init=x0)
-        X_bar = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
-        axes[i].plot(X_bar, color=col, label=f'$x_0 = \, {x0} $')
+        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
+        axes[i].plot(p_hat, color=col, label=f'$x_0 = \, {x0} $')
     axes[i].legend()
 plt.show()
 ```
 
+Note the convergence to the stationary distribution regardless of the starting point $x_0$.
+
 ### Example 3
 
 Let's look at one more example with six states {ref}`discussed before <mc_eg3>`.
@@ -295,8 +304,7 @@
 The {ref}`graph <mc_eg3>` for the chain shows all states are reachable,
 indicating that this chain is irreducible.
 
-Similar to previous examples, the sample path averages for each state converge
-to the stationary distribution.
+Here we visualize the difference between $\hat p_t(x)$ and the stationary distribution $\psi^* (x)$ for each state $x$
 
 ```{code-cell} ipython3
 P = [[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],
@@ -313,20 +321,23 @@ fig, ax = plt.subplots(figsize=(9, 6))
 X = mc.simulate(ts_length)
 # Center the plot at 0
 ax.set_ylim(-0.25, 0.25)
-ax.axhline(0, linestyle='dashed', lw=2, color = 'black', alpha=0.4)
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 
 for x0 in range(6):
     # Calculate the fraction of time for each state
-    X_bar = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
-    ax.plot(X_bar - ψ_star[x0], label=f'$X = {x0+1} $')
+    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
+    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
     ax.set_xlabel('t')
-    ax.set_ylabel(r'fraction of time spent in a state $- \psi^* (x)$')
+    ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')
 
 ax.legend()
 plt.show()
 ```
 
+Similar to previous examples, the sample path averages for each state converge
+to the stationary distribution.
+
 ### Example 4
 
 Let's look at another example with two states: 0 and 1.
@@ -364,19 +375,18 @@ fig, axes = plt.subplots(nrows=1, ncols=n)
 ψ_star = mc.stationary_distributions[0]
 
 for i in range(n):
-    axes[i].grid()
     axes[i].set_ylim(0.45, 0.55)
-    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color = 'black',
+    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                     label = fr'$\psi^*({i})$')
     axes[i].set_xlabel('t')
-    axes[i].set_ylabel(f'fraction of time spent at {i}')
+    axes[i].set_ylabel(fr'$\hat p_t({i})$')
 
     # Compute the fraction of time spent, for each x
     for x0 in range(n):
         # Generate time series starting at different x_0
         X = mc.simulate(ts_length, init=x0)
-        X_bar = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
-        axes[i].plot(X_bar, label=f'$x_0 = \, {x0} $')
+        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
+        axes[i].plot(p_hat, label=f'$x_0 = \, {x0} $')
 
     axes[i].legend()
 plt.show()
@@ -388,8 +398,6 @@ The proportion of time spent in a state can converge to the stationary distribut
 
 However, the distribution at each state does not.
 
-
-
 ### Expectations of geometric sums
 
 Sometimes we want to compute the mathematical expectation of a geometric sum, such as
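(For a finite chain this kind of expectation reduces to a linear solve. A minimal editorial sketch, not from the commit, with an illustrative stochastic matrix $P$, payoff vector $h$ and discount factor $\beta$: the vector of conditional expectations $\mathbb{E}\left[\sum_{j \geq 0} \beta^j h(X_{t+j}) \mid X_t = x\right]$ solves $(I - \beta P) y = h$.)

```python
import numpy as np

P = np.array([[0.9, 0.1],      # illustrative stochastic matrix
              [0.4, 0.6]])
h = np.array([1.0, 2.0])       # illustrative payoff at each state
β = 0.95                       # discount factor in (0, 1)

# E[Σ_j β^j h(X_{t+j}) | X_t = x] for each starting state x
y = np.linalg.solve(np.eye(2) - β * P, h)
print(y)
```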
@@ -505,14 +513,14 @@ mc = qe.MarkovChain(P)
 fig, ax = plt.subplots(figsize=(9, 6))
 X = mc.simulate(ts_length)
 ax.set_ylim(-0.25, 0.25)
-ax.axhline(0, linestyle='dashed', lw=2, color = 'black', alpha=0.4)
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 for x0 in range(8):
     # Calculate the fraction of time for each worker
-    X_bar = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
-    ax.plot(X_bar - ψ_star[x0], label=f'$X = {x0+1} $')
+    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
+    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
     ax.set_xlabel('t')
-    ax.set_ylabel(r'fraction of time spent in a state $- \psi^* (x)$')
+    ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')
 
 ax.legend()
 plt.show()
@@ -584,8 +592,7 @@ mc = qe.MarkovChain(P)
 
 fig, ax = plt.subplots(figsize=(9, 6))
 ax.set_ylim(-0.25, 0.25)
-ax.grid()
-ax.hlines(0, 0, ts_length, lw=2, alpha=0.6) # Horizonal line at zero
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 for x0, col in ((0, 'blue'), (1, 'green')):
     # Generate time series for worker that starts at x0
@@ -594,10 +601,12 @@ for x0, col in ((0, 'blue'), (1, 'green')):
     X_bar = (X == 0).cumsum() / (1 + np.arange(ts_length, dtype=float))
     # Plot
     ax.fill_between(range(ts_length), np.zeros(ts_length), X_bar - p, color=col, alpha=0.1)
-    ax.plot(X_bar - p, color=col, label=f'$X_0 = \, {x0} $')
+    ax.plot(X_bar - p, color=col, label=f'$x_0 = \, {x0} $')
     # Overlay in black--make lines clearer
     ax.plot(X_bar - p, 'k-', alpha=0.6)
-
+    ax.set_xlabel('t')
+    ax.set_ylabel(r'$\bar X_m - \psi^* (x)$')
+
 ax.legend(loc='upper right')
 plt.show()
 ```

lectures/networks.md

Lines changed: 1 addition & 1 deletion
@@ -329,7 +329,7 @@ For example,
 ```{code-cell} ipython3
 G_p.in_degree('p')
 ```
-
+(strongly_connected)=
 ### Communication
 
 Next, we study communication and connectedness, which have important
