
Commit 53c29f5

Copilot and TomeHirata authored
Add documentation for provider-side prompt caching with Anthropic and OpenAI (#8970)
* Initial plan
* Add documentation for provider-side prompt caching
* Remove unnecessary paragraph from prompt caching documentation
* Simplify prompt caching documentation by consolidating provider sections
* Remove additional configuration options section
* Remove duplicated example and redundant explanation
* Add reference to LiteLLM prompt caching documentation

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: TomeHirata <33407409+TomeHirata@users.noreply.github.com>
1 parent 8a7bcfd commit 53c29f5

File tree

1 file changed: +33 −0 lines changed

docs/docs/tutorials/cache/index.md

Lines changed: 33 additions & 0 deletions
@@ -48,6 +48,39 @@ Time elapse: 0.000529
 Total usage: {}
 ```
 
+## Using Provider-Side Prompt Caching
+
+In addition to DSPy's built-in caching mechanism, you can leverage provider-side prompt caching offered by LLM providers like Anthropic and OpenAI. This feature is particularly useful when working with modules like `dspy.ReAct()` that send similar prompts repeatedly, as it reduces both latency and costs by caching prompt prefixes on the provider's servers.
+
+You can enable prompt caching by passing the `cache_control_injection_points` parameter to `dspy.LM()`. This works with supported providers like Anthropic and OpenAI. For more details on this feature, see the [LiteLLM prompt caching documentation](https://docs.litellm.ai/docs/tutorials/prompt_caching#configuration).
+
+```python
+import dspy
+import os
+
+os.environ["ANTHROPIC_API_KEY"] = "{your_anthropic_key}"
+lm = dspy.LM(
+    "anthropic/claude-3-5-sonnet-20240620",
+    cache_control_injection_points=[
+        {
+            "location": "message",
+            "role": "system",
+        }
+    ],
+)
+dspy.configure(lm=lm)
+
+# Use with any DSPy module
+predict = dspy.Predict("question->answer")
+result = predict(question="What is the capital of France?")
+```
+
+This is especially beneficial when:
+
+- Using `dspy.ReAct()` with the same instructions
+- Working with long system prompts that remain constant
+- Making multiple requests with similar context
+
 ## Disabling/Enabling DSPy Cache
 
 There are scenarios where you might need to disable caching, either entirely or selectively for in-memory or on-disk caches. For instance:
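
The bullet list in the added section calls out `dspy.ReAct()` as a prime beneficiary of provider-side caching. The sketch below is not part of the commit; it is a minimal illustration of that pairing, assuming the same Anthropic model and a hypothetical `lookup_population` tool. ReAct re-sends its long, constant instruction prefix on every call and tool step, which is exactly what the provider caches.

```python
import os

import dspy

os.environ["ANTHROPIC_API_KEY"] = "{your_anthropic_key}"

# Same cache-enabled LM as in the added documentation.
lm = dspy.LM(
    "anthropic/claude-3-5-sonnet-20240620",
    cache_control_injection_points=[{"location": "message", "role": "system"}],
)
dspy.configure(lm=lm)


def lookup_population(city: str) -> str:
    """Hypothetical tool: return a canned population figure for a city."""
    return f"{city} has roughly 2.1 million inhabitants."


# The agent's repeated system/instruction prefix is reused from the provider-side
# cache across these calls, reducing latency and cost.
react = dspy.ReAct("question -> answer", tools=[lookup_population])

print(react(question="What is the population of Paris?").answer)
print(react(question="Is Paris larger than Lyon by population?").answer)
```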

0 commit comments
