
Commit d79e714

fix minor typos
1 parent 71bd205 commit d79e714

1 file changed: +4 −4 lines changed


lectures/mccall_q.md

Lines changed: 4 additions & 4 deletions
@@ -383,7 +383,7 @@ To set up such an algorithm, we first define some errors or "differences"
 $$
 \begin{aligned}
 w & + \beta \max_{\textrm{accept, reject}} \left\{ \hat Q_t (w_t, \textrm{accept}), \hat Q_t(w_t, \textrm{reject}) \right\} - \hat Q_t(w_t, \textrm{accept}) = \textrm{diff}_{\textrm{accept},t} \cr
-c & +\beta\int\max_{\text{accept, reject}}\left\{ \hat Q_t(w_{t+1}, \textrm{accept}),\hat Q_t\left(w_{t+1},\text{reject}\right)\right\} - \hat Q_t\left(w_t,\text{reject}\right) = \textrm{diff}_{\textrm{reject},t} \cr
+c & +\beta \max_{\text{accept, reject}}\left\{ \hat Q_t(w_{t+1}, \textrm{accept}),\hat Q_t\left(w_{t+1},\text{reject}\right)\right\} - \hat Q_t\left(w_t,\text{reject}\right) = \textrm{diff}_{\textrm{reject},t} \cr
 \end{aligned}
 $$ (eq:old105)

@@ -734,7 +734,7 @@ The above graphs indicates that
 ## Employed Worker Can't Quit


-The preceding version of temporal difference Q-learning described in equation system (4) lets an an employed worker quit, i.e., reject her wage as an incumbent and instead receive unemployment compensation this period
+The preceding version of temporal difference Q-learning described in equation system {eq}`eq:old4` lets an employed worker quit, i.e., reject her wage as an incumbent and instead receive unemployment compensation this period
 and draw a new offer next period.

 This is an option that the McCall worker described in {doc}`this quantecon lecture <mccall_model>` would not take.

@@ -756,11 +756,11 @@ $$
 \end{aligned}
 $$ (eq:temp-diff)

-It turns out that formulas {eq}`eq:temp-diff` combined with our Q-learning recursion (3) can lead our agent to eventually learn the optimal value function as well as in the case where an option to redraw can be exercised.
+It turns out that formulas {eq}`eq:temp-diff` combined with our Q-learning recursion {eq}`eq:old3` can lead our agent to eventually learn the optimal value function as well as in the case where an option to redraw can be exercised.

 But learning is slower because an agent who ends up accepting a wage offer prematurally loses the option to explore new states in the same episode and to adjust the value associated with that state.

-This can leads to inferior outcomes when the number of epochs/episods is low.
+This can lead to inferior outcomes when the number of epochs/episods is low.

 But if we increase the numb
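For readers skimming this diff, the corrected equation `eq:old105` defines the two temporal-difference errors that drive the worker's Q-table updates. Below is a minimal, illustrative Python sketch of those updates, not the lecture's own code: the names `w_grid`, `c`, `β`, `α`, the uniform offer distribution, and the episode count are all assumptions made here for the example.

```python
# Illustrative sketch only: tabular Q-learning updates built from the
# temporal-difference errors in eq:old105.  The wage grid, parameter values,
# and the uniform offer distribution are assumptions for this example.
import numpy as np

rng = np.random.default_rng(0)

w_grid = np.linspace(10, 60, 51)   # possible wage offers (assumed)
c = 25.0                           # unemployment compensation (assumed)
β = 0.98                           # discount factor (assumed)
α = 0.1                            # learning rate (assumed)

Q = np.zeros((len(w_grid), 2))     # column 0 = accept, column 1 = reject

def td_differences(i, i_next, Q):
    """diff_accept and diff_reject from eq:old105 at wage index i,
    where i_next indexes next period's wage draw."""
    diff_accept = w_grid[i] + β * Q[i].max() - Q[i, 0]
    diff_reject = c + β * Q[i_next].max() - Q[i, 1]
    return diff_accept, diff_reject

for _ in range(50_000):
    i = rng.integers(len(w_grid))        # this period's offer
    i_next = rng.integers(len(w_grid))   # next period's draw if rejected
    d_acc, d_rej = td_differences(i, i_next, Q)
    Q[i, 0] += α * d_acc                 # nudge the "accept" entry
    Q[i, 1] += α * d_rej                 # nudge the "reject" entry

# Smallest offer at which the learned Q-table prefers accepting
reservation = w_grid[np.argmax(Q[:, 0] >= Q[:, 1])]
print(f"approximate reservation wage: {reservation:.2f}")
```

Because the behavior here keeps visiting every (wage, action) pair at random, both differences are pushed toward zero on average, which is the fixed point the lecture's Q-learning recursion targets.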
