
Commit 091c605

Merge branch 'main' into feat/google-enhanced-json-schema
2 parents: a27e8f2 + 359c6d2

29 files changed: +1032, -444 lines

docs/dependencies.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 
 Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](agents.md#system-prompts), [tools](tools.md) and [output validators](output.md#output-validator-functions).
 
-Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic", this should make dependencies type-safe, understandable easier to test and ultimately easier to deploy in production.
+Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic". This should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production.
 
 ## Defining Dependencies

@@ -103,11 +103,11 @@ _(This example is complete, it can be run "as is" — you'll need to add `asynci
 [System prompt functions](agents.md#system-prompts), [function tools](tools.md) and [output validators](output.md#output-validator-functions) are all run in the async context of an agent run.
 
 If these functions are not coroutines (e.g. `async def`) they are called with
-[`run_in_executor`][asyncio.loop.run_in_executor] in a thread pool, it's therefore marginally preferable
+[`run_in_executor`][asyncio.loop.run_in_executor] in a thread pool. It's therefore marginally preferable
 to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too.
 
 !!! note "`run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies"
-    Whether you use synchronous or asynchronous dependencies, is completely independent of whether you use `run` or `run_sync`; `run_sync` is just a wrapper around `run` and agents are always run in an async context.
+    Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync`; `run_sync` is just a wrapper around `run` and agents are always run in an async context.
 
 Here's the same example as above, but with a synchronous dependency:

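The diff hunk ends before the synchronous example itself. As a minimal sketch of the pattern described above (adapted from the async example on the same docs page; the model name, URL, and `MyDeps` fields are assumptions rather than quoted from this commit):

```python
# Hedged sketch of a synchronous dependency, not the literal example
# from dependencies.md.
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.Client  # a synchronous client


agent = Agent('openai:gpt-4o', deps_type=MyDeps)


@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    # Not a coroutine, so it runs via run_in_executor in a thread pool,
    # as the diff above explains.
    response = ctx.deps.http_client.get(
        'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


def main():
    with httpx.Client() as client:
        deps = MyDeps('foo', client)
        result = agent.run_sync('Tell me a joke.', deps=deps)
        print(result.output)
```
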
docs/gateway.md

Lines changed: 27 additions & 3 deletions
@@ -54,22 +54,31 @@ Choose a name for your organization (or accept the default). You will automatica
 A default project will be created for you. You can choose to use it, or create a new one on the [Projects](https://gateway.pydantic.dev/admin/projects) page.
 
 ### Add **Providers**
+
 There are two ways to use Providers in the Pydantic AI Gateway: you can bring your own key (BYOK) or buy inference through the platform.
 
 #### Bringing your own API key (BYOK)
 
-On the [Providers](https://gateway.pydantic.dev/admin/providers) page, fill in the form to add a provider. Paste your API key into the form under Credentials, and make sure to **select the Project that will be associated to this provider**. It is possible to add multiple keys from the same provider.
+On the [Providers](https://gateway.pydantic.dev/admin/providers) page, fill in the form to add a provider.
+Paste your API key into the form under Credentials, and make sure to **select the Project that will be associated with this provider**.
+It is possible to add multiple keys from the same provider.
 
 #### Use Built-in Providers
-Go to the Billing page, add a payment method, and purchase $15 in credits to activate built-in providers. This gives you single-key access to all available models from OpenAI, Anthropic, Google Vertex, AWS Bedrock, and Groq.
+
+Go to the [Billing page](https://gateway.pydantic.dev/admin/billing), add a payment method, and purchase $15 in credits to activate built-in providers.
+This gives you single-key access to all available models from OpenAI, Anthropic, Google Vertex, AWS Bedrock, and Groq.
 
 ### Grant access to your team
+
 On the [Users](https://gateway.pydantic.dev/admin/users) page, create an invitation and share the URL with your team to allow them to access the project.
 
 ### Create Gateway project keys
-On the Keys page, Admins can create project keys which are not affected by spending limits. Users can only create personal keys, that will inherit spending caps from both User and Project levels, whichever is more restrictive.
+
+On the Keys page, Admins can create project keys, which are not affected by spending limits.
+Users can only create personal keys, which inherit spending caps from both the User and Project levels, whichever is more restrictive.
 
 ## Usage
+
 After setting up your account with the instructions above, you will be able to make an AI model request with the Pydantic AI Gateway.
 The code snippets below show how you can use PAIG with different frameworks and SDKs.
 You can add `gateway/` as prefix on every known provider that
@@ -87,6 +96,7 @@ Examples of providers and models that can be used are:
 | AWS Bedrock | `bedrock` | `gateway/bedrock:amazon.nova-micro-v1:0` |
 
 ### Pydantic AI
+
 Before you start, make sure you are on version 1.16 or later of `pydantic-ai`. To update to the latest version run:
 
 === "uv"
@@ -123,6 +133,7 @@ The first known use of "hello, world" was in a 1974 textbook about the C program
 
 
 ### Claude Code
+
 Before you start, log out of Claude Code using `/logout`.
 
 Set your gateway credentials as environment variables:
@@ -174,3 +185,16 @@ response = client.messages.create(
 print(response.content[0].text)
 #> Hello user
 ```
+
+## Troubleshooting
+
+### Unable to calculate spend
+
+The gateway needs to know the cost of the request in order to provide insights about the spend, and to enforce spending limits.
+If it's unable to calculate the cost, it will return a 400 error with the message "Unable to calculate spend".
+
+When [configuring a provider](https://gateway.pydantic.dev/admin/providers/new), you need to decide if you want the gateway to block
+the API key if it's unable to calculate the cost. If you choose to block the API key, any further requests using that API key will fail.
+
+We are actively working on supporting more providers and models.
+If you have a specific provider that you would like to see supported, please let us know on [Slack](https://logfire.pydantic.dev/docs/join-slack/) or [open an issue on `genai-prices`](https://github.com/pydantic/genai-prices/issues/new).
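
To make the new Usage section concrete, here is a minimal sketch of the `gateway/` prefix with Pydantic AI. The exact model name and the environment variable holding your gateway key are assumptions, not taken from this diff:

```python
# Minimal sketch, assuming a gateway key is configured in the environment
# (e.g. a PYDANTIC_AI_GATEWAY_API_KEY-style variable; the exact name is an
# assumption) and that the OpenAI model below is available in your project.
from pydantic_ai import Agent

# The `gateway/` prefix routes the request through the Pydantic AI Gateway.
agent = Agent('gateway/openai:gpt-5')

result = agent.run_sync('Where does "hello, world" come from?')
print(result.output)
```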

docs/models/anthropic.md

Lines changed: 67 additions & 0 deletions
@@ -77,3 +77,70 @@ model = AnthropicModel(
 agent = Agent(model)
 ...
 ```
+
+## Prompt Caching
+
+Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:
+
+1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
+2. **Cache System Instructions**: Enable the [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] [model setting](../agents.md#model-run-settings) to cache your system prompt
+3. **Cache Tool Definitions**: Enable the [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] [model setting](../agents.md#model-run-settings) to cache your tool definitions
+
+You can combine all three strategies for maximum savings:
+
+```python {test="skip"}
+from pydantic_ai import Agent, CachePoint, RunContext
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Detailed instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True,
+        anthropic_cache_tool_definitions=True,
+    ),
+)
+
+@agent.tool
+def search_docs(ctx: RunContext, query: str) -> str:
+    """Search documentation."""
+    return f'Results for {query}'
+
+async def main():
+    # First call - writes to cache
+    result1 = await agent.run([
+        'Long context from documentation...',
+        CachePoint(),
+        'First question'
+    ])
+
+    # Subsequent calls - read from cache (90% cost reduction)
+    result2 = await agent.run([
+        'Long context from documentation...',  # Same content
+        CachePoint(),
+        'Second question'
+    ])
+    print(f'First: {result1.output}')
+    print(f'Second: {result2.output}')
+```
+
+Access cache usage statistics via `result.usage()`:
+
+```python {test="skip"}
+from pydantic_ai import Agent
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True
+    ),
+)
+
+async def main():
+    result = await agent.run('Your question')
+    usage = result.usage()
+    print(f'Cache write tokens: {usage.cache_write_tokens}')
+    print(f'Cache read tokens: {usage.cache_read_tokens}')
+```

pydantic_ai_slim/pydantic_ai/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,7 @@
     BinaryImage,
     BuiltinToolCallPart,
     BuiltinToolReturnPart,
+    CachePoint,
     DocumentFormat,
     DocumentMediaType,
     DocumentUrl,
@@ -141,6 +142,7 @@
     'BinaryContent',
     'BuiltinToolCallPart',
     'BuiltinToolReturnPart',
+    'CachePoint',
     'DocumentFormat',
     'DocumentMediaType',
     'DocumentUrl',
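
With these two lines, `CachePoint` (already defined in `pydantic_ai.messages`) becomes importable from the package root, matching the import used in the new prompt-caching docs above:

```python
from pydantic_ai import CachePoint  # re-exported from pydantic_ai.messages

# Insert into a user message to cache everything before it.
marker = CachePoint()
```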
Lines changed: 58 additions & 108 deletions
@@ -1,24 +1,14 @@
 from __future__ import annotations
 
 import warnings
-from collections.abc import AsyncIterator, Callable, Sequence
-from contextlib import AbstractAsyncContextManager
 from dataclasses import replace
 from typing import Any
 
 from pydantic.errors import PydanticUserError
-from temporalio.client import ClientConfig, Plugin as ClientPlugin, WorkflowHistory
 from temporalio.contrib.pydantic import PydanticPayloadConverter, pydantic_data_converter
 from temporalio.converter import DataConverter, DefaultPayloadConverter
-from temporalio.service import ConnectConfig, ServiceClient
-from temporalio.worker import (
-    Plugin as WorkerPlugin,
-    Replayer,
-    ReplayerConfig,
-    Worker,
-    WorkerConfig,
-    WorkflowReplayResult,
-)
+from temporalio.plugin import SimplePlugin
+from temporalio.worker import WorkflowRunner
 from temporalio.worker.workflow_sandbox import SandboxedWorkflowRunner
 
 from ...exceptions import UserError
@@ -48,104 +38,64 @@
     pass
 
 
-class PydanticAIPlugin(ClientPlugin, WorkerPlugin):
+def _data_converter(converter: DataConverter | None) -> DataConverter:
+    if converter and converter.payload_converter_class not in (
+        DefaultPayloadConverter,
+        PydanticPayloadConverter,
+    ):
+        warnings.warn(  # pragma: no cover
+            'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.'
+        )
+
+    return pydantic_data_converter
+
+
+def _workflow_runner(runner: WorkflowRunner | None) -> WorkflowRunner:
+    if not runner:
+        raise ValueError('No WorkflowRunner provided to the Pydantic AI plugin.')  # pragma: no cover
+
+    if not isinstance(runner, SandboxedWorkflowRunner):
+        return runner  # pragma: no cover
+
+    return replace(
+        runner,
+        restrictions=runner.restrictions.with_passthrough_modules(
+            'pydantic_ai',
+            'pydantic',
+            'pydantic_core',
+            'logfire',
+            'rich',
+            'httpx',
+            'anyio',
+            'httpcore',
+            # Used by fastmcp via py-key-value-aio
+            'beartype',
+            # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize
+            'attrs',
+            # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize
+            'numpy',
+            'pandas',
+        ),
+    )
+
+
+class PydanticAIPlugin(SimplePlugin):
     """Temporal client and worker plugin for Pydantic AI."""
 
-    def init_client_plugin(self, next: ClientPlugin) -> None:
-        self.next_client_plugin = next
-
-    def init_worker_plugin(self, next: WorkerPlugin) -> None:
-        self.next_worker_plugin = next
-
-    def configure_client(self, config: ClientConfig) -> ClientConfig:
-        config['data_converter'] = self._get_new_data_converter(config.get('data_converter'))
-        return self.next_client_plugin.configure_client(config)
-
-    def configure_worker(self, config: WorkerConfig) -> WorkerConfig:
-        runner = config.get('workflow_runner')  # pyright: ignore[reportUnknownMemberType]
-        if isinstance(runner, SandboxedWorkflowRunner):  # pragma: no branch
-            config['workflow_runner'] = replace(
-                runner,
-                restrictions=runner.restrictions.with_passthrough_modules(
-                    'pydantic_ai',
-                    'pydantic',
-                    'pydantic_core',
-                    'logfire',
-                    'rich',
-                    'httpx',
-                    'anyio',
-                    'httpcore',
-                    # Used by fastmcp via py-key-value-aio
-                    'beartype',
-                    # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize
-                    'attrs',
-                    # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize
-                    'numpy',
-                    'pandas',
-                ),
-            )
-
-        config['workflow_failure_exception_types'] = [
-            *config.get('workflow_failure_exception_types', []),  # pyright: ignore[reportUnknownMemberType]
-            UserError,
-            PydanticUserError,
-        ]
-
-        return self.next_worker_plugin.configure_worker(config)
-
-    async def connect_service_client(self, config: ConnectConfig) -> ServiceClient:
-        return await self.next_client_plugin.connect_service_client(config)
-
-    async def run_worker(self, worker: Worker) -> None:
-        await self.next_worker_plugin.run_worker(worker)
-
-    def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig:  # pragma: no cover
-        config['data_converter'] = self._get_new_data_converter(config.get('data_converter'))  # pyright: ignore[reportUnknownMemberType]
-        return self.next_worker_plugin.configure_replayer(config)
-
-    def run_replayer(
-        self,
-        replayer: Replayer,
-        histories: AsyncIterator[WorkflowHistory],
-    ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]:  # pragma: no cover
-        return self.next_worker_plugin.run_replayer(replayer, histories)
-
-    def _get_new_data_converter(self, converter: DataConverter | None) -> DataConverter:
-        if converter and converter.payload_converter_class not in (
-            DefaultPayloadConverter,
-            PydanticPayloadConverter,
-        ):
-            warnings.warn(  # pragma: no cover
-                'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.'
-            )
-
-        return pydantic_data_converter
-
-
-class AgentPlugin(WorkerPlugin):
-    """Temporal worker plugin for a specific Pydantic AI agent."""
-
-    def __init__(self, agent: TemporalAgent[Any, Any]):
-        self.agent = agent
-
-    def init_worker_plugin(self, next: WorkerPlugin) -> None:
-        self.next_worker_plugin = next
+    def __init__(self):
+        super().__init__(  # type: ignore[reportUnknownMemberType]
+            name='PydanticAIPlugin',
+            data_converter=_data_converter,
+            workflow_runner=_workflow_runner,
+            workflow_failure_exception_types=[UserError, PydanticUserError],
+        )
 
-    def configure_worker(self, config: WorkerConfig) -> WorkerConfig:
-        activities: Sequence[Callable[..., Any]] = config.get('activities', [])  # pyright: ignore[reportUnknownMemberType]
-        # Activities are checked for name conflicts by Temporal.
-        config['activities'] = [*activities, *self.agent.temporal_activities]
-        return self.next_worker_plugin.configure_worker(config)
 
-    async def run_worker(self, worker: Worker) -> None:
-        await self.next_worker_plugin.run_worker(worker)
-
-    def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig:  # pragma: no cover
-        return self.next_worker_plugin.configure_replayer(config)
+class AgentPlugin(SimplePlugin):
+    """Temporal worker plugin for a specific Pydantic AI agent."""
 
-    def run_replayer(
-        self,
-        replayer: Replayer,
-        histories: AsyncIterator[WorkflowHistory],
-    ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]:  # pragma: no cover
-        return self.next_worker_plugin.run_replayer(replayer, histories)
+    def __init__(self, agent: TemporalAgent[Any, Any]):
+        super().__init__(  # type: ignore[reportUnknownMemberType]
+            name='AgentPlugin',
+            activities=agent.temporal_activities,
+        )

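For context, a minimal sketch of where these plugins attach. This assumes temporalio's `plugins=` parameters on `Client.connect` and `Worker`, and that `TemporalAgent`, `PydanticAIPlugin`, and `AgentPlugin` are exported from `pydantic_ai.durable_exec.temporal`; the model, workflow, and task queue names are illustrative:

```python
from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import AgentPlugin, PydanticAIPlugin, TemporalAgent

# TemporalAgent wraps a named agent so its model and tool calls run as activities.
agent = TemporalAgent(Agent('openai:gpt-4o', name='my_agent'))


@workflow.defn
class MyAgentWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await agent.run(prompt)
        return result.output


async def main():
    # PydanticAIPlugin swaps in the Pydantic data converter and the
    # sandbox passthrough modules listed in the diff above.
    client = await Client.connect('localhost:7233', plugins=[PydanticAIPlugin()])

    # AgentPlugin registers the agent's Temporal activities on the worker.
    worker = Worker(
        client,
        task_queue='my-task-queue',
        workflows=[MyAgentWorkflow],
        plugins=[AgentPlugin(agent)],
    )
    await worker.run()
```
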
pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py

Lines changed: 1 addition & 1 deletion
@@ -219,7 +219,7 @@ async def _call_event_stream_handler_activity(
     ) -> None:
         serialized_run_context = self.run_context_type.serialize_run_context(ctx)
         async for event in stream:
-            await workflow.execute_activity(  # pyright: ignore[reportUnknownMemberType]
+            await workflow.execute_activity(
                 activity=self.event_stream_handler_activity,
                 args=[
                     _EventStreamHandlerParams(

pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_function_toolset.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ async def call_tool(
         tool_activity_config = self.activity_config | tool_activity_config
         serialized_run_context = self.run_context_type.serialize_run_context(ctx)
         return self._unwrap_call_tool_result(
-            await workflow.execute_activity(  # pyright: ignore[reportUnknownMemberType]
+            await workflow.execute_activity(
                 activity=self.call_tool_activity,
                 args=[
                     CallToolParams(
