
Commit 2cd5b19

update graphs in markov chains
1 parent 1a1e771 commit 2cd5b19


lectures/markov_chains_II.md

Lines changed: 45 additions & 52 deletions
@@ -4,7 +4,7 @@ jupytext:
     extension: .md
     format_name: myst
     format_version: 0.13
-    jupytext_version: 1.14.4
+    jupytext_version: 1.14.5
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -37,24 +37,19 @@ to be installed on your computer. Installation instructions for graphviz can be
 [here](https://www.graphviz.org/download/)
 ```
 
-
 ## Overview
 
-This lecture continues on from our {doc}`earlier lecture on Markov chains
-<markov_chains_I>`.
-
+This lecture continues our journey in Markov chains.
 
-Specifically, we will introduce the concepts of irreducibility and ergodicity, and see how they connect to stationarity.
+Specifically, we will introduce irreducibility and ergodicity, and how they connect to stationarity.
 
-Irreducibility describes the ability of a Markov chain to move between any two states in the system.
+Irreducibility is a concept that describes the ability of a Markov chain to move between any two states in the system.
 
 Ergodicity is a sample path property that describes the behavior of the system over long periods of time.
 
-As we will see,
+The concepts of irreducibility and ergodicity are closely related to the idea of stationarity.
 
-* an irreducible Markov chain guarantees the existence of a unique stationary distribution, while
-* an ergodic Markov chain generates time series that satisfy a version of the
-  law of large numbers.
+An irreducible Markov chain guarantees the existence of a unique stationary distribution, while an ergodic Markov chain ensures that the system eventually reaches its stationary distribution, regardless of its initial state.
 
 Together, these concepts provide a foundation for understanding the long-term behavior of Markov chains.
 
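For reference, the long-run behavior referred to in this hunk is usually made precise by a sample-path law of large numbers: for an irreducible chain with stationary distribution $\psi^*$ and any initial condition,

$$
\frac{1}{m} \sum_{t=1}^{m} \mathbf{1}\{X_t = x\} \to \psi^*(x)
\quad \text{as } m \to \infty,
$$

with probability one, for every state $x$. The plots edited later in this commit visualize exactly this convergence.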
@@ -74,7 +69,9 @@ import matplotlib as mpl
 ## Irreducibility
 
 
-To explain irreducibility, let's take $P$ to be a fixed stochastic matrix.
+Irreducibility is a central concept of Markov chain theory.
+
+To explain it, let's take $P$ to be a fixed stochastic matrix.
 
 Two states $x$ and $y$ are said to **communicate** with each other if
 there exist positive integers $j$ and $k$ such that
@@ -179,8 +176,6 @@ mc = qe.MarkovChain(P, ('poor', 'middle', 'rich'))
 mc.is_irreducible
 ```
 
-+++ {"user_expressions": []}
-
 It might be clear to you already that irreducibility is going to be important
 in terms of long-run outcomes.
 
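The `is_irreducible` check used above works for any stochastic matrix; here is a minimal sketch (the matrix below is illustrative only; it is not the 'poor/middle/rich' matrix from the lecture, which this diff does not show):

```python
import numpy as np
import quantecon as qe

# Illustrative stochastic matrix (rows sum to one); placeholder values,
# not the transition probabilities used in the lecture.
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

mc = qe.MarkovChain(P, state_values=('poor', 'middle', 'rich'))
print(mc.is_irreducible)          # True: every state communicates with every other
print(mc.communication_classes)   # a single class containing all three states
```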
@@ -270,19 +265,19 @@ In view of our latest (ergodicity) result, it is also the fraction of time that
 
 Thus, in the long run, cross-sectional averages for a population and time-series averages for a given person coincide.
 
-This is one aspect of the concept of ergodicity.
+This is one aspect of the concept of ergodicity.
 
 
 (ergo)=
 ### Example 2
 
-Another example is the Hamilton dynamics we {ref}`discussed before <mc_eg2>`.
+Another example is the Hamilton {cite}`Hamilton2005` dynamics {ref}`discussed before <mc_eg2>`.
+
+The diagram of the Markov chain shows that it is **irreducible**.
 
-The {ref}`graph <mc_eg2>` of the Markov chain shows it is irreducible
+Therefore, we can see the sample path averages for each state (the fraction of time spent in each state) converge to the stationary distribution regardless of the starting state.
 
-Therefore, we can see the sample path averages for each state (the fraction of
-time spent in each state) converges to the stationary distribution regardless of
-the starting state
+Let's denote the fraction of time spent in state $x$ at time $t$ in our sample path as $\hat p_t(x)$ and compare it with the stationary distribution $\psi^* (x)$.
 
 ```{code-cell} ipython3
 P = np.array([[0.971, 0.029, 0.000],
@@ -291,27 +286,28 @@ P = np.array([[0.971, 0.029, 0.000],
 ts_length = 10_000
 mc = qe.MarkovChain(P)
 n = len(P)
-fig, axes = plt.subplots(nrows=1, ncols=n)
+fig, axes = plt.subplots(nrows=1, ncols=n, figsize=(15, 6))
 ψ_star = mc.stationary_distributions[0]
 plt.subplots_adjust(wspace=0.35)
 
 for i in range(n):
-    axes[i].grid()
-    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color = 'black',
+    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                     label = fr'$\psi^*({i})$')
     axes[i].set_xlabel('t')
-    axes[i].set_ylabel(f'fraction of time spent at {i}')
+    axes[i].set_ylabel(fr'$\hat p_t({i})$')
 
     # Compute the fraction of time spent, starting from different x_0s
     for x0, col in ((0, 'blue'), (1, 'green'), (2, 'red')):
         # Generate time series that starts at different x0
         X = mc.simulate(ts_length, init=x0)
-        X_bar = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
-        axes[i].plot(X_bar, color=col, label=f'$x_0 = \, {x0} $')
+        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
+        axes[i].plot(p_hat, color=col, label=f'$x_0 = \, {x0} $')
     axes[i].legend()
 plt.show()
 ```
 
+Note the convergence to the stationary distribution regardless of the starting point $x_0$.
+
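The `p_hat` line introduced in this hunk is a running occupation frequency: a cumulative count of visits to a state divided by elapsed time. A tiny standalone check of the idiom (toy data, not the lecture's simulation):

```python
import numpy as np

# Toy sample path; in the lecture X comes from mc.simulate(...)
X = np.array([0, 1, 1, 2, 0, 0, 1])

# Fraction of time spent in state 1 up to and including each date t
p_hat = (X == 1).cumsum() / (1 + np.arange(len(X), dtype=float))
print(p_hat.round(3))  # [0.    0.5   0.667 0.5   0.4   0.333 0.429]
```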
 ### Example 3
 
 Let's look at one more example with six states {ref}`discussed before <mc_eg3>`.
@@ -330,11 +326,9 @@ P :=
 $$
 
 
-The {ref}`graph <mc_eg3>` for the chain shows all states are reachable,
-indicating that this chain is irreducible.
+The graph for the chain shows that the states are densely connected, indicating that it is **irreducible**.
 
-Similar to previous examples, the sample path averages for each state converge
-to the stationary distribution.
+Then we visualize the difference between $\hat p_t(x)$ and the stationary distribution $\psi^* (x)$.
 
 ```{code-cell} ipython3
 P = [[0.86, 0.11, 0.03, 0.00, 0.00, 0.00],
@@ -351,20 +345,22 @@ fig, ax = plt.subplots(figsize=(9, 6))
 X = mc.simulate(ts_length)
 # Center the plot at 0
 ax.set_ylim(-0.25, 0.25)
-ax.axhline(0, linestyle='dashed', lw=2, color = 'black', alpha=0.4)
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 
 for x0 in range(6):
     # Calculate the fraction of time for each state
-    X_bar = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
-    ax.plot(X_bar - ψ_star[x0], label=f'$X = {x0+1} $')
+    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
+    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
 ax.set_xlabel('t')
-ax.set_ylabel(r'fraction of time spent in a state $- \psi^* (x)$')
+ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')
 
 ax.legend()
 plt.show()
 ```
 
+Similar to previous examples, the sample path averages for each state converge to the stationary distribution, as each deviation converges towards 0.
+
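The benchmark $\psi^*$ subtracted in these plots comes from `mc.stationary_distributions`; it can be sanity-checked against the defining condition $\psi^* P = \psi^*$. A short sketch with an arbitrary irreducible matrix (not the six-state matrix from the lecture):

```python
import numpy as np
import quantecon as qe

# Any irreducible stochastic matrix will do; these values are illustrative only.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.4, 0.5]])

mc = qe.MarkovChain(P)
ψ_star = mc.stationary_distributions[0]

# A stationary distribution is a fixed point of the map ψ ↦ ψP
print(np.allclose(ψ_star @ P, ψ_star))   # True
print(np.isclose(ψ_star.sum(), 1.0))     # True
```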
 ### Example 4
 
 Let's look at another example with two states: 0 and 1.
@@ -395,8 +391,7 @@ dot.edge("1", "0", label="1.0", color='red')
 dot
 ```
 
-
-In fact it has a periodic cycle --- the state cycles between the two states in a regular way.
+Unlike other Markov chains we have seen before, it has a periodic cycle --- the state cycles between the two states in a regular way.
 
 This is called [periodicity](https://www.randomservices.org/random/markov/Periodicity.html).
 
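Periodicity can also be verified numerically. A minimal sketch, assuming the two-state chain drawn above moves from 0 to 1 and from 1 to 0 with probability one (as the edge labels suggest):

```python
import numpy as np
import quantecon as qe

# Transition matrix assumed from the diagram: each state jumps to the other
# with probability 1.0 every period.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

mc = qe.MarkovChain(P)
print(mc.is_irreducible)  # True: the two states communicate
print(mc.is_aperiodic)    # False: the chain is periodic
print(mc.period)          # 2: each state is revisited only at even dates
```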
@@ -412,19 +407,18 @@ fig, axes = plt.subplots(nrows=1, ncols=n)
 ψ_star = mc.stationary_distributions[0]
 
 for i in range(n):
-    axes[i].grid()
     axes[i].set_ylim(0.45, 0.55)
-    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color = 'black',
+    axes[i].axhline(ψ_star[i], linestyle='dashed', lw=2, color='black',
                     label = fr'$\psi^*({i})$')
     axes[i].set_xlabel('t')
-    axes[i].set_ylabel(f'fraction of time spent at {i}')
+    axes[i].set_ylabel(fr'$\hat p_t({i})$')
 
     # Compute the fraction of time spent, for each x
     for x0 in range(n):
         # Generate time series starting at different x_0
         X = mc.simulate(ts_length, init=x0)
-        X_bar = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
-        axes[i].plot(X_bar, label=f'$x_0 = \, {x0} $')
+        p_hat = (X == i).cumsum() / (1 + np.arange(ts_length, dtype=float))
+        axes[i].plot(p_hat, label=f'$x_0 = \, {x0} $')
 
     axes[i].legend()
 plt.show()
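A quick way to see the distinction the next hunk draws (time averages settle down while the date-$t$ distribution keeps oscillating) is to iterate the marginal distribution of the same assumed two-state matrix directly:

```python
import numpy as np

# Periodic two-state chain again (assumed P[0, 1] = P[1, 0] = 1)
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

ψ = np.array([1.0, 0.0])      # all probability mass on state 0 at t = 0
for t in range(6):
    print(t, ψ)               # alternates between [1, 0] and [0, 1]
    ψ = ψ @ P                 # marginal distribution at the next date

# The distribution ψ_t never converges, yet the long-run fraction of time
# spent in each state still converges to the stationary value 0.5.
```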
@@ -436,8 +430,6 @@ The proportion of time spent in a state can converge to the stationary distribution
 
 However, the distribution at each state does not.
 
-+++ {"user_expressions": []}
-
 ### Expectations of geometric sums
 
 Sometimes we want to compute the mathematical expectation of a geometric sum, such as
@@ -553,14 +545,14 @@ mc = qe.MarkovChain(P)
 fig, ax = plt.subplots(figsize=(9, 6))
 X = mc.simulate(ts_length)
 ax.set_ylim(-0.25, 0.25)
-ax.axhline(0, linestyle='dashed', lw=2, color = 'black', alpha=0.4)
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 for x0 in range(8):
     # Calculate the fraction of time for each worker
-    X_bar = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
-    ax.plot(X_bar - ψ_star[x0], label=f'$X = {x0+1} $')
+    p_hat = (X == x0).cumsum() / (1 + np.arange(ts_length, dtype=float))
+    ax.plot(p_hat - ψ_star[x0], label=f'$x = {x0+1} $')
 ax.set_xlabel('t')
-ax.set_ylabel(r'fraction of time spent in a state $- \psi^* (x)$')
+ax.set_ylabel(r'$\hat p_t(x) - \psi^* (x)$')
 
 ax.legend()
 plt.show()
@@ -616,7 +608,7 @@ The result should be similar to the plot we plotted [here](ergo)
 
 We will address this exercise graphically.
 
-The plots show the time series of $\bar X_m - p$ for two initial
+The plots show the time series of $\bar{\{X=x\}}_m - p$ for two initial
 conditions.
 
 As $m$ gets large, both series converge to zero.
@@ -632,8 +624,7 @@ mc = qe.MarkovChain(P)
 
 fig, ax = plt.subplots(figsize=(9, 6))
 ax.set_ylim(-0.25, 0.25)
-ax.grid()
-ax.hlines(0, 0, ts_length, lw=2, alpha=0.6)  # Horizonal line at zero
+ax.axhline(0, linestyle='dashed', lw=2, color='black', alpha=0.4)
 
 for x0, col in ((0, 'blue'), (1, 'green')):
     # Generate time series for worker that starts at x0
@@ -642,10 +633,12 @@ for x0, col in ((0, 'blue'), (1, 'green')):
     X_bar = (X == 0).cumsum() / (1 + np.arange(ts_length, dtype=float))
     # Plot
     ax.fill_between(range(ts_length), np.zeros(ts_length), X_bar - p, color=col, alpha=0.1)
-    ax.plot(X_bar - p, color=col, label=f'$X_0 = \, {x0} $')
+    ax.plot(X_bar - p, color=col, label=f'$x_0 = \, {x0} $')
     # Overlay in black--make lines clearer
     ax.plot(X_bar - p, 'k-', alpha=0.6)
-
+ax.set_xlabel('t')
+ax.set_ylabel(r'$\bar X_m - \psi^* (x)$')
+
 ax.legend(loc='upper right')
 plt.show()
 ```