[Docs] Add eviction policy (#839)

oulgen · web-flow · commit 94d73aafa653 · 2025-10-08T11:01:28.000-07:00
diff --git a/README.md b/README.md
@@ -263,6 +263,12 @@ Changing these options results in often significantly different
 output Triton code, allowing the autotuner to explore a wide range of
 implementations from a single Helion kernel.
 
+* **load_eviction_policies** (`list[str]`):
+Controls the eviction policy used for loads discovered in device loops. Each entry
+corresponds to a load site; allowed values are `""` (no policy), `"first"`
+(maps to Triton `evict_first`), and `"last"` (maps to Triton `evict_last`).
+An explicit `eviction_policy=...` argument on `hl.load` overrides this config.
+
 ## Settings for Development and Debugging
 
 When developing kernels with Helion, you might prefer skipping autotuning for faster iteration. To
diff --git a/docs/api/config.md b/docs/api/config.md
@@ -116,6 +116,24 @@ Configs are typically discovered automatically through autotuning, but can also
    - ``"block_ptr"``: Block pointer indexing
 ```
 
+### Memory and Caching
+
+```{eval-rst}
+.. autoattribute:: Config.load_eviction_policies
+
+   Eviction policies for load operations issued from device loops. Provide one policy
+   per ``hl.load`` site discovered in the kernel. Allowed values:
+
+   - ``""``: No eviction policy (the ``eviction_policy`` argument is omitted)
+   - ``"first"``: Maps to Triton ``eviction_policy='evict_first'``
+   - ``"last"``: Maps to Triton ``eviction_policy='evict_last'``
+
+   Notes:
+
+   - The number of entries must match the number of load sites considered tunable by the kernel.
+   - An explicit ``eviction_policy=...`` argument passed to ``hl.load`` overrides this config.
+```
+
 ## Usage Examples
 
 ### Manual Config Creation
@@ -142,6 +160,31 @@ def my_kernel(x: torch.Tensor) -> torch.Tensor:
     return result
 ```
 
+### Eviction Policy Example
+
+```python
+import torch
+import helion
+import helion.language as hl
+
+@helion.kernel(
+    config={
+        "block_size": 16,
+        "load_eviction_policies": ["", "last"],  # second load uses evict_last
+    }
+)
+def kernel_with_eviction(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
+    out = torch.empty_like(x)
+    for tile in hl.tile(x.size(0)):
+        a = hl.load(x, [tile])  # first load site: "" means no eviction policy
+        b = hl.load(y, [tile])  # second load site: "last" maps to evict_last
+        out[tile] = a + b
+    return out
+
+# Explicit policy on hl.load overrides config:
+# hl.load(x, [tile], eviction_policy="evict_first")
+```
+
 ### Config Serialization
 
 ```python