Optimize Aiyagari model: Switch to VFI with JIT-compiled lax.while_loop (#674)
* Optimize Aiyagari model: use VFI with JIT-compiled lax.while_loop
This commit significantly improves the performance and code quality of the
Aiyagari model lecture by switching from Howard Policy Iteration (HPI) to
Value Function Iteration (VFI) as the primary solution method, with HPI
moved to an exercise.
Major changes:
- Replace HPI with VFI using jax.lax.while_loop and @jax.jit compilation
- Reduce asset grid size from 200 to 100 points for efficiency
- Reduce asset grid maximum from 20 to 12.5 (better suited for equilibrium)
- Use 'loop_state' instead of 'state' in loops to avoid DP terminology confusion
- Remove redundant @jax.jit decorators from helper functions (only on top-level functions)
- Move HPI implementation to Exercise 3 with complete solution
Performance improvements:
- VFI equilibrium computation: ~0.68 seconds (was ~11+ seconds with damped iteration)
- HPI in Exercise 3: ~0.48 seconds with optimized JIT compilation
- 85x speedup compared to unoptimized Python loops
Code quality improvements:
- Cleaner JIT compilation strategy (only on ultimate calling functions); see the sketch after this list
- Both VFI and HPI use compiled lax.while_loop for consistency
- Helper functions automatically inlined and optimized by JAX
- Clear separation of main content (VFI) and advanced material (HPI exercise)
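
To make the compilation strategy concrete, here is a minimal, hypothetical sketch of the pattern the bullets describe: a helper update function with no decorator, called from a single jitted top-level solver that drives a `jax.lax.while_loop`. The names `T` and `solve_by_iteration`, and the toy fixed point v = r + β P v, are illustrative stand-ins, not the lecture's actual code.

```python
import jax
import jax.numpy as jnp

# Helper deliberately left without @jax.jit: it is traced and inlined when
# the top-level solver below is compiled.
def T(v, r, P, β):
    """One fixed-point update v -> r + β P v (a stand-in for the Bellman operator)."""
    return r + β * P @ v

@jax.jit
def solve_by_iteration(r, P, β, tol=1e-8, max_iter=10_000):
    """Iterate T to convergence inside a compiled lax.while_loop."""

    def cond(loop_state):
        v, error, i = loop_state
        return (error > tol) & (i < max_iter)

    def body(loop_state):
        v, _, i = loop_state
        v_new = T(v, r, P, β)
        return v_new, jnp.max(jnp.abs(v_new - v)), i + 1

    init = (jnp.zeros_like(r), jnp.array(jnp.inf), jnp.array(0))
    v, error, i = jax.lax.while_loop(cond, body, init)
    return v

# Toy usage on a two-state Markov reward process
P = jnp.array([[0.9, 0.1],
               [0.2, 0.8]])
r = jnp.array([1.0, 2.0])
print(solve_by_iteration(r, P, 0.96))
```

Only the outer function carries `@jax.jit`; the helper is fused into the same compiled program, which is the strategy the bullets above describe for the household solver.
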
Educational improvements:
- Students learn VFI first (simpler, more standard algorithm)
- HPI presented as advanced exercise with guidance and complete solution
- Exercise asks students to verify both methods produce same equilibrium
Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix broken reference in aiyagari.md: Replace opt_savings_2 with Dynamic Programming book link
Replace the broken cross-reference to opt_savings_2 (which doesn't exist in this PR) with a direct link to the Dynamic Programming book at dp.quantecon.org where Howard policy iteration is discussed in detail.
This fixes the build warning:
aiyagari.md:689: WARNING: unknown document: 'opt_savings_2'
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update aiyagari.md: Fix reference to VFI instead of HPI
Updated the "Primitives and operators" section to correctly state that we solve the household problem using value function iteration (VFI), not Howard policy iteration (HPI). Removed the outdated reference to Ch 5 of Dynamic Programming book.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
2. Set up the linear operator $R_{\sigma}$ where $(R_{\sigma} v)(a, z) = v(a, z) - \beta \sum_{z'} v(\sigma(a, z), z') \Pi(z, z')$
3. Solve $v_{\sigma} = R_{\sigma}^{-1} r_{\sigma}$ using `jax.scipy.sparse.linalg.bicgstab` (a rough sketch of steps 2 and 3 is given just below)
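
As a rough guide, one way to carry out steps 2 and 3 is to apply $R_{\sigma}$ matrix-free and pass its action as a callable to `jax.scipy.sparse.linalg.bicgstab`. The sketch below assumes that $v$ and $r_{\sigma}$ are arrays of shape `(a_size, z_size)` and that `σ[i, j]` holds the asset-grid index of $\sigma(a_i, z_j)$; the function names are placeholders rather than functions defined in this lecture.

```python
import jax
import jax.numpy as jnp

def R_σ(v, σ, Π, β):
    """
    Apply (R_σ v)(a, z) = v(a, z) - β Σ_{z'} v(σ(a, z), z') Π(z, z').

    Assumes v has shape (a_size, z_size) and σ[i, j] is the index of
    σ(a_i, z_j) on the asset grid.
    """
    V = v[σ]                        # V[i, j, jp] = v(σ(a_i, z_j), z_jp)
    EV = jnp.sum(V * Π, axis=-1)    # expectation over z' under Π
    return v - β * EV

def solve_v_σ(r_σ, σ, Π, β):
    """Solve R_σ v = r_σ with a matrix-free Krylov method."""
    v_σ, _ = jax.scipy.sparse.linalg.bicgstab(lambda v: R_σ(v, σ, Π, β), r_σ)
    return v_σ

# Tiny shape check with placeholder inputs
β, Π = 0.96, jnp.array([[0.9, 0.1], [0.2, 0.8]])
σ = jnp.zeros((5, 2), dtype=int)       # e.g. always choose the lowest asset index
r_σ = jnp.ones((5, 2))
print(solve_v_σ(r_σ, σ, Π, β).shape)   # (5, 2)
```

Because `bicgstab` accepts a function in place of an explicit matrix, the operator $R_{\sigma}$ never has to be materialized as a dense array.
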
You can use the `get_greedy` function that's already defined in this lecture.

Implement the following Howard policy iteration routine:

```python
def howard_policy_iteration(household, prices,
                            tol=1e-4, max_iter=10_000, verbose=False):
    """
    Howard policy iteration routine.
    """
    # Your code here
    pass
```

Once implemented, compute the equilibrium capital stock using HPI and verify that it produces approximately the same result as VFI at the default parameter values.

```{exercise-end}
```

```{solution-start} aiyagari_ex3
:class: dropdown
```

First, we need to implement the helper functions for Howard policy iteration.

The following function computes the array $r_{\sigma}$ which gives current rewards given policy $\sigma$:

```{code-cell} ipython3
def compute_r_σ(σ, household, prices):
    """
    Compute current rewards at each i, j under policy σ. In particular,