
Commit 091c605

Merge branch 'main' into feat/google-enhanced-json-schema
2 parents: a27e8f2 + 359c6d2

29 files changed: +1032, -444 lines

docs/dependencies.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 
 Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](agents.md#system-prompts), [tools](tools.md) and [output validators](output.md#output-validator-functions).
 
-Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic", this should make dependencies type-safe, understandable easier to test and ultimately easier to deploy in production.
+Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic". This should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production.
 
 ## Defining Dependencies

@@ -103,11 +103,11 @@ _(This example is complete, it can be run "as is" — you'll need to add `asynci
 [System prompt functions](agents.md#system-prompts), [function tools](tools.md) and [output validators](output.md#output-validator-functions) are all run in the async context of an agent run.
 
 If these functions are not coroutines (e.g. `async def`) they are called with
-[`run_in_executor`][asyncio.loop.run_in_executor] in a thread pool, it's therefore marginally preferable
+[`run_in_executor`][asyncio.loop.run_in_executor] in a thread pool. It's therefore marginally preferable
 to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too.
 
 !!! note "`run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies"
-    Whether you use synchronous or asynchronous dependencies, is completely independent of whether you use `run` or `run_sync`; `run_sync` is just a wrapper around `run` and agents are always run in an async context.
+    Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync`; `run_sync` is just a wrapper around `run` and agents are always run in an async context.
 
 Here's the same example as above, but with a synchronous dependency:

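The diff hunk ends before the synchronous example itself. As a minimal sketch of the pattern described above (adapted from the async example on the same docs page; the model name, URL, and `MyDeps` fields are assumptions rather than quoted from this commit):

```python
# Hedged sketch of a synchronous dependency, not the literal example
# from dependencies.md.
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.Client  # a synchronous client


agent = Agent('openai:gpt-4o', deps_type=MyDeps)


@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    # Not a coroutine, so it runs via run_in_executor in a thread pool,
    # as the diff above explains.
    response = ctx.deps.http_client.get(
        'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


def main():
    with httpx.Client() as client:
        deps = MyDeps('foo', client)
        result = agent.run_sync('Tell me a joke.', deps=deps)
        print(result.output)
```
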
docs/gateway.md

Lines changed: 27 additions & 3 deletions
@@ -54,22 +54,31 @@ Choose a name for your organization (or accept the default). You will automatica
 A default project will be created for you. You can choose to use it, or create a new one on the [Projects](https://gateway.pydantic.dev/admin/projects) page.
 
 ### Add **Providers**
+
 There are two ways to use Providers in the Pydantic AI Gateway: you can bring your own key (BYOK) or buy inference through the platform.
 
 #### Bringing your own API key (BYOK)
 
-On the [Providers](https://gateway.pydantic.dev/admin/providers) page, fill in the form to add a provider. Paste your API key into the form under Credentials, and make sure to **select the Project that will be associated to this provider**. It is possible to add multiple keys from the same provider.
+On the [Providers](https://gateway.pydantic.dev/admin/providers) page, fill in the form to add a provider.
+Paste your API key into the form under Credentials, and make sure to **select the Project that will be associated with this provider**.
+It is possible to add multiple keys from the same provider.
 
 #### Use Built-in Providers
-Go to the Billing page, add a payment method, and purchase $15 in credits to activate built-in providers. This gives you single-key access to all available models from OpenAI, Anthropic, Google Vertex, AWS Bedrock, and Groq.
+
+Go to the [Billing page](https://gateway.pydantic.dev/admin/billing), add a payment method, and purchase $15 in credits to activate built-in providers.
+This gives you single-key access to all available models from OpenAI, Anthropic, Google Vertex, AWS Bedrock, and Groq.
 
 ### Grant access to your team
+
 On the [Users](https://gateway.pydantic.dev/admin/users) page, create an invitation and share the URL with your team to allow them to access the project.
 
 ### Create Gateway project keys
-On the Keys page, Admins can create project keys which are not affected by spending limits. Users can only create personal keys, that will inherit spending caps from both User and Project levels, whichever is more restrictive.
+
+On the Keys page, Admins can create project keys, which are not affected by spending limits.
+Users can only create personal keys, which inherit spending caps from both the User and Project levels, whichever is more restrictive.
 
 ## Usage
+
 After setting up your account with the instructions above, you will be able to make an AI model request with the Pydantic AI Gateway.
 The code snippets below show how you can use PAIG with different frameworks and SDKs.
 You can add `gateway/` as prefix on every known provider that
@@ -87,6 +96,7 @@ Examples of providers and models that can be used are:
 | AWS Bedrock | `bedrock` | `gateway/bedrock:amazon.nova-micro-v1:0` |
 
 ### Pydantic AI
+
 Before you start, make sure you are on version 1.16 or later of `pydantic-ai`. To update to the latest version run:
 
 === "uv"
@@ -123,6 +133,7 @@ The first known use of "hello, world" was in a 1974 textbook about the C program
 
 
 ### Claude Code
+
 Before you start, log out of Claude Code using `/logout`.
 
 Set your gateway credentials as environment variables:
@@ -174,3 +185,16 @@ response = client.messages.create(
 print(response.content[0].text)
 #> Hello user
 ```
+
+## Troubleshooting
+
+### Unable to calculate spend
+
+The gateway needs to know the cost of the request in order to provide insights about the spend, and to enforce spending limits.
+If it's unable to calculate the cost, it will return a 400 error with the message "Unable to calculate spend".
+
+When [configuring a provider](https://gateway.pydantic.dev/admin/providers/new), you need to decide if you want the gateway to block
+the API key if it's unable to calculate the cost. If you choose to block the API key, any further requests using that API key will fail.
+
+We are actively working on supporting more providers and models.
+If you have a specific provider that you would like to see supported, please let us know on [Slack](https://logfire.pydantic.dev/docs/join-slack/) or [open an issue on `genai-prices`](https://github.com/pydantic/genai-prices/issues/new).
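
To make the new Usage section concrete, here is a minimal sketch of the `gateway/` prefix with Pydantic AI. The exact model name and the environment variable holding your gateway key are assumptions, not taken from this diff:

```python
# Minimal sketch, assuming a gateway key is configured in the environment
# (e.g. a PYDANTIC_AI_GATEWAY_API_KEY-style variable; the exact name is an
# assumption) and that the OpenAI model below is available in your project.
from pydantic_ai import Agent

# The `gateway/` prefix routes the request through the Pydantic AI Gateway.
agent = Agent('gateway/openai:gpt-5')

result = agent.run_sync('Where does "hello, world" come from?')
print(result.output)
```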

docs/models/anthropic.md

Lines changed: 67 additions & 0 deletions
@@ -77,3 +77,70 @@ model = AnthropicModel(
 agent = Agent(model)
 ...
 ```
+
+## Prompt Caching
+
+Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:
+
+1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
+2. **Cache System Instructions**: Enable the [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] [model setting](../agents.md#model-run-settings) to cache your system prompt
+3. **Cache Tool Definitions**: Enable the [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] [model setting](../agents.md#model-run-settings) to cache your tool definitions
+
+You can combine all three strategies for maximum savings:
+
+```python {test="skip"}
+from pydantic_ai import Agent, CachePoint, RunContext
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Detailed instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True,
+        anthropic_cache_tool_definitions=True,
+    ),
+)
+
+@agent.tool
+def search_docs(ctx: RunContext, query: str) -> str:
+    """Search documentation."""
+    return f'Results for {query}'
+
+async def main():
+    # First call - writes to cache
+    result1 = await agent.run([
+        'Long context from documentation...',
+        CachePoint(),
+        'First question'
+    ])
+
+    # Subsequent calls - read from cache (90% cost reduction)
+    result2 = await agent.run([
+        'Long context from documentation...',  # Same content
+        CachePoint(),
+        'Second question'
+    ])
+    print(f'First: {result1.output}')
+    print(f'Second: {result2.output}')
+```
+
+Access cache usage statistics via `result.usage()`:
+
+```python {test="skip"}
+from pydantic_ai import Agent
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True
+    ),
+)
+
+async def main():
+    result = await agent.run('Your question')
+    usage = result.usage()
+    print(f'Cache write tokens: {usage.cache_write_tokens}')
+    print(f'Cache read tokens: {usage.cache_read_tokens}')
+```

pydantic_ai_slim/pydantic_ai/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,7 @@
     BinaryImage,
     BuiltinToolCallPart,
     BuiltinToolReturnPart,
+    CachePoint,
     DocumentFormat,
     DocumentMediaType,
     DocumentUrl,
@@ -141,6 +142,7 @@
     'BinaryContent',
     'BuiltinToolCallPart',
     'BuiltinToolReturnPart',
+    'CachePoint',
     'DocumentFormat',
     'DocumentMediaType',
     'DocumentUrl',
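
With these two lines, `CachePoint` (already defined in `pydantic_ai.messages`) becomes importable from the package root, matching the import used in the new prompt-caching docs above:

```python
from pydantic_ai import CachePoint  # re-exported from pydantic_ai.messages

# Insert into a user message to cache everything before it.
marker = CachePoint()
```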
Lines changed: 58 additions & 108 deletions
@@ -1,24 +1,14 @@
 from __future__ import annotations
 
 import warnings
-from collections.abc import AsyncIterator, Callable, Sequence
-from contextlib import AbstractAsyncContextManager
 from dataclasses import replace
 from typing import Any
 
 from pydantic.errors import PydanticUserError
-from temporalio.client import ClientConfig, Plugin as ClientPlugin, WorkflowHistory
 from temporalio.contrib.pydantic import PydanticPayloadConverter, pydantic_data_converter
 from temporalio.converter import DataConverter, DefaultPayloadConverter
-from temporalio.service import ConnectConfig, ServiceClient
-from temporalio.worker import (
-    Plugin as WorkerPlugin,
-    Replayer,
-    ReplayerConfig,
-    Worker,
-    WorkerConfig,
-    WorkflowReplayResult,
-)
+from temporalio.plugin import SimplePlugin
+from temporalio.worker import WorkflowRunner
 from temporalio.worker.workflow_sandbox import SandboxedWorkflowRunner
 
 from ...exceptions import UserError
@@ -48,104 +38,64 @@
     pass
 
 
-class PydanticAIPlugin(ClientPlugin, WorkerPlugin):
+def _data_converter(converter: DataConverter | None) -> DataConverter:
+    if converter and converter.payload_converter_class not in (
+        DefaultPayloadConverter,
+        PydanticPayloadConverter,
+    ):
+        warnings.warn(  # pragma: no cover
+            'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.'
+        )
+
+    return pydantic_data_converter
+
+
+def _workflow_runner(runner: WorkflowRunner | None) -> WorkflowRunner:
+    if not runner:
+        raise ValueError('No WorkflowRunner provided to the Pydantic AI plugin.')  # pragma: no cover
+
+    if not isinstance(runner, SandboxedWorkflowRunner):
+        return runner  # pragma: no cover
+
+    return replace(
+        runner,
+        restrictions=runner.restrictions.with_passthrough_modules(
+            'pydantic_ai',
+            'pydantic',
+            'pydantic_core',
+            'logfire',
+            'rich',
+            'httpx',
+            'anyio',
+            'httpcore',
+            # Used by fastmcp via py-key-value-aio
+            'beartype',
+            # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize
+            'attrs',
+            # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize
+            'numpy',
+            'pandas',
+        ),
+    )
+
+
+class PydanticAIPlugin(SimplePlugin):
     """Temporal client and worker plugin for Pydantic AI."""
 
-    def init_client_plugin(self, next: ClientPlugin) -> None:
-        self.next_client_plugin = next
-
-    def init_worker_plugin(self, next: WorkerPlugin) -> None:
-        self.next_worker_plugin = next
-
-    def configure_client(self, config: ClientConfig) -> ClientConfig:
-        config['data_converter'] = self._get_new_data_converter(config.get('data_converter'))
-        return self.next_client_plugin.configure_client(config)
-
-    def configure_worker(self, config: WorkerConfig) -> WorkerConfig:
-        runner = config.get('workflow_runner')  # pyright: ignore[reportUnknownMemberType]
-        if isinstance(runner, SandboxedWorkflowRunner):  # pragma: no branch
-            config['workflow_runner'] = replace(
-                runner,
-                restrictions=runner.restrictions.with_passthrough_modules(
-                    'pydantic_ai',
-                    'pydantic',
-                    'pydantic_core',
-                    'logfire',
-                    'rich',
-                    'httpx',
-                    'anyio',
-                    'httpcore',
-                    # Used by fastmcp via py-key-value-aio
-                    'beartype',
-                    # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize
-                    'attrs',
-                    # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize
-                    'numpy',
-                    'pandas',
-                ),
-            )
-
-        config['workflow_failure_exception_types'] = [
-            *config.get('workflow_failure_exception_types', []),  # pyright: ignore[reportUnknownMemberType]
-            UserError,
-            PydanticUserError,
-        ]
-
-        return self.next_worker_plugin.configure_worker(config)
-
-    async def connect_service_client(self, config: ConnectConfig) -> ServiceClient:
-        return await self.next_client_plugin.connect_service_client(config)
-
-    async def run_worker(self, worker: Worker) -> None:
-        await self.next_worker_plugin.run_worker(worker)
-
-    def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig:  # pragma: no cover
-        config['data_converter'] = self._get_new_data_converter(config.get('data_converter'))  # pyright: ignore[reportUnknownMemberType]
-        return self.next_worker_plugin.configure_replayer(config)
-
-    def run_replayer(
-        self,
-        replayer: Replayer,
-        histories: AsyncIterator[WorkflowHistory],
-    ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]:  # pragma: no cover
-        return self.next_worker_plugin.run_replayer(replayer, histories)
-
-    def _get_new_data_converter(self, converter: DataConverter | None) -> DataConverter:
-        if converter and converter.payload_converter_class not in (
-            DefaultPayloadConverter,
-            PydanticPayloadConverter,
-        ):
-            warnings.warn(  # pragma: no cover
-                'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.'
-            )
-
-        return pydantic_data_converter
-
-
-class AgentPlugin(WorkerPlugin):
-    """Temporal worker plugin for a specific Pydantic AI agent."""
-
-    def __init__(self, agent: TemporalAgent[Any, Any]):
-        self.agent = agent
-
-    def init_worker_plugin(self, next: WorkerPlugin) -> None:
-        self.next_worker_plugin = next
+    def __init__(self):
+        super().__init__(  # type: ignore[reportUnknownMemberType]
+            name='PydanticAIPlugin',
+            data_converter=_data_converter,
+            workflow_runner=_workflow_runner,
+            workflow_failure_exception_types=[UserError, PydanticUserError],
+        )
 
-    def configure_worker(self, config: WorkerConfig) -> WorkerConfig:
-        activities: Sequence[Callable[..., Any]] = config.get('activities', [])  # pyright: ignore[reportUnknownMemberType]
-        # Activities are checked for name conflicts by Temporal.
-        config['activities'] = [*activities, *self.agent.temporal_activities]
-        return self.next_worker_plugin.configure_worker(config)
 
-    async def run_worker(self, worker: Worker) -> None:
-        await self.next_worker_plugin.run_worker(worker)
-
-    def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig:  # pragma: no cover
-        return self.next_worker_plugin.configure_replayer(config)
+class AgentPlugin(SimplePlugin):
+    """Temporal worker plugin for a specific Pydantic AI agent."""
 
-    def run_replayer(
-        self,
-        replayer: Replayer,
-        histories: AsyncIterator[WorkflowHistory],
-    ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]:  # pragma: no cover
-        return self.next_worker_plugin.run_replayer(replayer, histories)
+    def __init__(self, agent: TemporalAgent[Any, Any]):
+        super().__init__(  # type: ignore[reportUnknownMemberType]
+            name='AgentPlugin',
+            activities=agent.temporal_activities,
+        )

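For context, a minimal sketch of where these plugins attach. This assumes temporalio's `plugins=` parameters on `Client.connect` and `Worker`, and that `TemporalAgent`, `PydanticAIPlugin`, and `AgentPlugin` are exported from `pydantic_ai.durable_exec.temporal`; the model, workflow, and task queue names are illustrative:

```python
from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import AgentPlugin, PydanticAIPlugin, TemporalAgent

# TemporalAgent wraps a named agent so its model and tool calls run as activities.
agent = TemporalAgent(Agent('openai:gpt-4o', name='my_agent'))


@workflow.defn
class MyAgentWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await agent.run(prompt)
        return result.output


async def main():
    # PydanticAIPlugin swaps in the Pydantic data converter and the
    # sandbox passthrough modules listed in the diff above.
    client = await Client.connect('localhost:7233', plugins=[PydanticAIPlugin()])

    # AgentPlugin registers the agent's Temporal activities on the worker.
    worker = Worker(
        client,
        task_queue='my-task-queue',
        workflows=[MyAgentWorkflow],
        plugins=[AgentPlugin(agent)],
    )
    await worker.run()
```
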
pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py

Lines changed: 1 addition & 1 deletion
@@ -219,7 +219,7 @@ async def _call_event_stream_handler_activity(
     ) -> None:
         serialized_run_context = self.run_context_type.serialize_run_context(ctx)
         async for event in stream:
-            await workflow.execute_activity(  # pyright: ignore[reportUnknownMemberType]
+            await workflow.execute_activity(
                 activity=self.event_stream_handler_activity,
                 args=[
                     _EventStreamHandlerParams(

pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_function_toolset.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ async def call_tool(
         tool_activity_config = self.activity_config | tool_activity_config
         serialized_run_context = self.run_context_type.serialize_run_context(ctx)
         return self._unwrap_call_tool_result(
-            await workflow.execute_activity(  # pyright: ignore[reportUnknownMemberType]
+            await workflow.execute_activity(
                 activity=self.call_tool_activity,
                 args=[
                     CallToolParams(
