diff --git a/docs/handoffs.md b/docs/handoffs.md
index 85707c6b3..8a9d1f1b3 100644
--- a/docs/handoffs.md
+++ b/docs/handoffs.md
@@ -82,6 +82,8 @@ handoff_obj = handoff(
 
 When a handoff occurs, it's as though the new agent takes over the conversation, and gets to see the entire previous conversation history. If you want to change this, you can set an [`input_filter`][agents.handoffs.Handoff.input_filter]. An input filter is a function that receives the existing input via a [`HandoffInputData`][agents.handoffs.HandoffInputData], and must return a new `HandoffInputData`.
 
+By default, the runner now collapses the prior transcript into a single assistant summary message (see [`RunConfig.nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]). The summary is wrapped in conversation-history markers and keeps appending new turns when multiple handoffs happen during the same run. You can provide your own mapping function via [`RunConfig.handoff_history_mapper`][agents.run.RunConfig.handoff_history_mapper] to replace the generated message without writing a full `input_filter`. This default only applies when neither the handoff nor the run supplies an explicit `input_filter`, so existing code that already customizes the payload (including the examples in this repository) keeps its current behavior unchanged. You can override the nesting behavior for a single handoff by passing `nest_handoff_history=True` or `False` to [`handoff(...)`][agents.handoffs.handoff], which sets [`Handoff.nest_handoff_history`][agents.handoffs.Handoff.nest_handoff_history]. If you just need to change the wrapper text around the generated summary, call [`set_conversation_history_wrappers`][agents.handoffs.set_conversation_history_wrappers] (and optionally [`reset_conversation_history_wrappers`][agents.handoffs.reset_conversation_history_wrappers]) before running your agents.
+
 There are some common patterns (for example removing all tool calls from the history), which are implemented for you in [`agents.extensions.handoff_filters`][]
 
 ```python
diff --git a/docs/release.md b/docs/release.md
index 95a4f67a4..93161a7c8 100644
--- a/docs/release.md
+++ b/docs/release.md
@@ -25,6 +25,8 @@ This version doesn’t introduce any visible breaking changes, but it includes n
 - Added support for `RealtimeRunner` to handle [SIP protocol connections](https://platform.openai.com/docs/guides/realtime-sip)
 - Significantly revised the internal logic of `Runner#run_sync` for Python 3.14 compatibility
+- By default, handoff history is now packaged into a single assistant message instead of exposing the raw user/assistant turns, giving downstream agents a concise, predictable recap
+- The generated single-message handoff transcript starts with "For context, here is the conversation so far between the user and the previous agent:" before the wrapped history, so downstream agents get a clearly labeled recap
 
 ### 0.4.0
diff --git a/docs/running_agents.md b/docs/running_agents.md
index ab69d8463..fb3e9aa47 100644
--- a/docs/running_agents.md
+++ b/docs/running_agents.md
@@ -51,11 +51,15 @@ The `run_config` parameter lets you configure some global settings for the agent
 - [`model_settings`][agents.run.RunConfig.model_settings]: Overrides agent-specific settings. For example, you can set a global `temperature` or `top_p`.
 - [`input_guardrails`][agents.run.RunConfig.input_guardrails], [`output_guardrails`][agents.run.RunConfig.output_guardrails]: A list of input or output guardrails to include on all runs.
 - [`handoff_input_filter`][agents.run.RunConfig.handoff_input_filter]: A global input filter to apply to all handoffs, if the handoff doesn't already have one. The input filter allows you to edit the inputs that are sent to the new agent. See the documentation in [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] for more details.
+- [`nest_handoff_history`][agents.run.RunConfig.nest_handoff_history]: When `True` (the default), the runner collapses the prior transcript into a single assistant message before invoking the next agent. The generated summary is wrapped in conversation-history markers and keeps appending new turns as subsequent handoffs occur. Set this to `False`, or provide a custom handoff filter, if you prefer to pass through the raw transcript. All [`Runner` methods][agents.run.Runner] automatically create a `RunConfig` when you do not pass one, so the quickstarts and examples pick up this default automatically, and any explicit [`Handoff.input_filter`][agents.handoffs.Handoff.input_filter] callbacks continue to override it. Individual handoffs can override this setting via [`Handoff.nest_handoff_history`][agents.handoffs.Handoff.nest_handoff_history].
+- [`handoff_history_mapper`][agents.run.RunConfig.handoff_history_mapper]: An optional callable that receives the normalized transcript (history plus handoff items) whenever `nest_handoff_history` is `True`. It must return the exact list of input items to forward to the next agent, letting you replace the built-in summary without writing a full handoff filter.
 - [`tracing_disabled`][agents.run.RunConfig.tracing_disabled]: Allows you to disable [tracing](tracing.md) for the entire run.
 - [`trace_include_sensitive_data`][agents.run.RunConfig.trace_include_sensitive_data]: Configures whether traces will include potentially sensitive data, such as LLM and tool call inputs/outputs.
 - [`workflow_name`][agents.run.RunConfig.workflow_name], [`trace_id`][agents.run.RunConfig.trace_id], [`group_id`][agents.run.RunConfig.group_id]: Sets the tracing workflow name, trace ID and trace group ID for the run. We recommend at least setting `workflow_name`. The group ID is an optional field that lets you link traces across multiple runs.
 - [`trace_metadata`][agents.run.RunConfig.trace_metadata]: Metadata to include on all traces.
 
+By default, the SDK now nests prior turns inside a single assistant summary message whenever an agent hands off to another agent. This reduces repeated assistant messages and keeps the full transcript inside a single block that the next agent can scan quickly. If you'd like to return to the legacy behavior, pass `RunConfig(nest_handoff_history=False)` or supply a `handoff_input_filter` (or `handoff_history_mapper`) that forwards the conversation exactly as you need. You can also opt out (or in) for a specific handoff by setting `handoff(..., nest_handoff_history=False)` or `True`. To change the wrapper text used in the generated summary without writing a custom mapper, call [`set_conversation_history_wrappers`][agents.handoffs.set_conversation_history_wrappers] (and [`reset_conversation_history_wrappers`][agents.handoffs.reset_conversation_history_wrappers] to restore the defaults).
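+
+For example, here is a minimal sketch of these options (the agents and the mapper below are illustrative, not part of the SDK):
+
+```python
+from agents import Agent, RunConfig, Runner, handoff, set_conversation_history_wrappers
+
+billing_agent = Agent(name="Billing agent", instructions="Handle billing questions.")
+triage_agent = Agent(
+    name="Triage agent",
+    instructions="Route the user to the right specialist.",
+    # Opt this specific handoff out of the nested summary, regardless of the run default.
+    handoffs=[handoff(billing_agent, nest_handoff_history=False)],
+)
+
+# Disable nesting for the whole run to keep the raw transcript behavior.
+result = Runner.run_sync(
+    triage_agent,
+    "I was double charged.",
+    run_config=RunConfig(nest_handoff_history=False),
+)
+
+# Or keep nesting, but replace the generated summary with your own mapping
+# from the normalized transcript to the input items the next agent receives.
+def keep_last_two_items(transcript):
+    return transcript[-2:]
+
+result = Runner.run_sync(
+    triage_agent,
+    "I was double charged.",
+    run_config=RunConfig(handoff_history_mapper=keep_last_two_items),
+)
+
+# Optionally change the wrapper text placed around the generated summary.
+set_conversation_history_wrappers(start="[HISTORY]", end="[/HISTORY]")
+```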
+ ## Conversations/chat threads Calling any of the run methods can result in one or more agents running (and hence one or more LLM calls), but it represents a single logical turn in a chat conversation. For example: @@ -200,4 +204,4 @@ The SDK raises exceptions in certain cases. The full list is in [`agents.excepti - Malformed JSON: When the model provides a malformed JSON structure for tool calls or in its direct output, especially if a specific `output_type` is defined. - Unexpected tool-related failures: When the model fails to use tools in an expected manner - [`UserError`][agents.exceptions.UserError]: This exception is raised when you (the person writing code using the SDK) make an error while using the SDK. This typically results from incorrect code implementation, invalid configuration, or misuse of the SDK's API. -- [`InputGuardrailTripwireTriggered`][agents.exceptions.InputGuardrailTripwireTriggered], [`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered]: This exception is raised when the conditions of an input guardrail or output guardrail are met, respectively. Input guardrails check incoming messages before processing, while output guardrails check the agent's final response before delivery. \ No newline at end of file +- [`InputGuardrailTripwireTriggered`][agents.exceptions.InputGuardrailTripwireTriggered], [`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered]: This exception is raised when the conditions of an input guardrail or output guardrail are met, respectively. Input guardrails check incoming messages before processing, while output guardrails check the agent's final response before delivery. diff --git a/src/agents/__init__.py b/src/agents/__init__.py index b285d6f8c..8488cd540 100644 --- a/src/agents/__init__.py +++ b/src/agents/__init__.py @@ -34,7 +34,17 @@ input_guardrail, output_guardrail, ) -from .handoffs import Handoff, HandoffInputData, HandoffInputFilter, handoff +from .handoffs import ( + Handoff, + HandoffInputData, + HandoffInputFilter, + default_handoff_history_mapper, + get_conversation_history_wrappers, + handoff, + nest_handoff_history, + reset_conversation_history_wrappers, + set_conversation_history_wrappers, +) from .items import ( HandoffCallItem, HandoffOutputItem, @@ -191,6 +201,11 @@ def enable_verbose_stdout_logging(): "StopAtTools", "ToolsToFinalOutputFunction", "ToolsToFinalOutputResult", + "default_handoff_history_mapper", + "get_conversation_history_wrappers", + "nest_handoff_history", + "reset_conversation_history_wrappers", + "set_conversation_history_wrappers", "Runner", "run_demo_loop", "Model", diff --git a/src/agents/_run_impl.py b/src/agents/_run_impl.py index 88a770a56..8be6fbb0d 100644 --- a/src/agents/_run_impl.py +++ b/src/agents/_run_impl.py @@ -52,7 +52,7 @@ UserError, ) from .guardrail import InputGuardrail, InputGuardrailResult, OutputGuardrail, OutputGuardrailResult -from .handoffs import Handoff, HandoffInputData +from .handoffs import Handoff, HandoffInputData, nest_handoff_history from .items import ( HandoffCallItem, HandoffOutputItem, @@ -998,8 +998,14 @@ async def execute_handoffs( input_filter = handoff.input_filter or ( run_config.handoff_input_filter if run_config else None ) - if input_filter: - logger.debug("Filtering inputs for handoff") + handoff_nest_setting = handoff.nest_handoff_history + should_nest_history = ( + handoff_nest_setting + if handoff_nest_setting is not None + else run_config.nest_handoff_history + ) + handoff_input_data: 
HandoffInputData | None = None + if input_filter or should_nest_history: handoff_input_data = HandoffInputData( input_history=tuple(original_input) if isinstance(original_input, list) @@ -1008,6 +1014,17 @@ async def execute_handoffs( new_items=tuple(new_step_items), run_context=context_wrapper, ) + + if input_filter and handoff_input_data is not None: + filter_name = getattr(input_filter, "__qualname__", repr(input_filter)) + from_agent = getattr(agent, "name", agent.__class__.__name__) + to_agent = getattr(new_agent, "name", new_agent.__class__.__name__) + logger.debug( + "Filtering handoff inputs with %s for %s -> %s", + filter_name, + from_agent, + to_agent, + ) if not callable(input_filter): _error_tracing.attach_error_to_span( span_handoff, @@ -1037,6 +1054,18 @@ async def execute_handoffs( ) pre_step_items = list(filtered.pre_handoff_items) new_step_items = list(filtered.new_items) + elif should_nest_history and handoff_input_data is not None: + nested = nest_handoff_history( + handoff_input_data, + history_mapper=run_config.handoff_history_mapper, + ) + original_input = ( + nested.input_history + if isinstance(nested.input_history, str) + else list(nested.input_history) + ) + pre_step_items = list(nested.pre_handoff_items) + new_step_items = list(nested.new_items) return SingleStepResult( original_input=original_input, diff --git a/src/agents/extensions/handoff_filters.py b/src/agents/extensions/handoff_filters.py index a4433ae0c..85d68c1d8 100644 --- a/src/agents/extensions/handoff_filters.py +++ b/src/agents/extensions/handoff_filters.py @@ -1,6 +1,10 @@ from __future__ import annotations -from ..handoffs import HandoffInputData +from ..handoffs import ( + HandoffInputData, + default_handoff_history_mapper, + nest_handoff_history, +) from ..items import ( HandoffCallItem, HandoffOutputItem, @@ -13,6 +17,12 @@ """Contains common handoff input filters, for convenience. 
""" +__all__ = [ + "remove_all_tools", + "nest_handoff_history", + "default_handoff_history_mapper", +] + def remove_all_tools(handoff_input_data: HandoffInputData) -> HandoffInputData: """Filters out all tool items: file search, web search and function calls+output.""" diff --git a/src/agents/handoffs.py b/src/agents/handoffs/__init__.py similarity index 53% rename from src/agents/handoffs.py rename to src/agents/handoffs/__init__.py index 2c52737ad..8db14659f 100644 --- a/src/agents/handoffs.py +++ b/src/agents/handoffs/__init__.py @@ -9,22 +9,26 @@ from pydantic import TypeAdapter from typing_extensions import TypeAlias, TypeVar -from .exceptions import ModelBehaviorError, UserError -from .items import RunItem, TResponseInputItem -from .run_context import RunContextWrapper, TContext -from .strict_schema import ensure_strict_json_schema -from .tracing.spans import SpanError -from .util import _error_tracing, _json, _transforms -from .util._types import MaybeAwaitable +from ..exceptions import ModelBehaviorError, UserError +from ..items import RunItem, TResponseInputItem +from ..run_context import RunContextWrapper, TContext +from ..strict_schema import ensure_strict_json_schema +from ..tracing.spans import SpanError +from ..util import _error_tracing, _json, _transforms +from ..util._types import MaybeAwaitable +from .history import ( + default_handoff_history_mapper, + get_conversation_history_wrappers, + nest_handoff_history, + reset_conversation_history_wrappers, + set_conversation_history_wrappers, +) if TYPE_CHECKING: - from .agent import Agent, AgentBase + from ..agent import Agent, AgentBase -# The handoff input type is the type of data passed when the agent is called via a handoff. THandoffInput = TypeVar("THandoffInput", default=Any) - -# The agent type that the handoff returns TAgent = TypeVar("TAgent", bound="AgentBase[Any]", default="Agent[Any]") OnHandoffWithInput = Callable[[RunContextWrapper[Any], THandoffInput], Any] @@ -34,97 +38,31 @@ @dataclass(frozen=True) class HandoffInputData: input_history: str | tuple[TResponseInputItem, ...] - """ - The input history before `Runner.run()` was called. - """ - pre_handoff_items: tuple[RunItem, ...] - """ - The items generated before the agent turn where the handoff was invoked. - """ - new_items: tuple[RunItem, ...] - """ - The new items generated during the current agent turn, including the item that triggered the - handoff and the tool output message representing the response from the handoff output. - """ - run_context: RunContextWrapper[Any] | None = None - """ - The run context at the time the handoff was invoked. - Note that, since this property was added later on, it's optional for backwards compatibility. - """ def clone(self, **kwargs: Any) -> HandoffInputData: - """ - Make a copy of the handoff input data, with the given arguments changed. For example, you - could do: - ``` - new_handoff_input_data = handoff_input_data.clone(new_items=()) - ``` - """ return dataclasses_replace(self, **kwargs) HandoffInputFilter: TypeAlias = Callable[[HandoffInputData], MaybeAwaitable[HandoffInputData]] -"""A function that filters the input data passed to the next agent.""" +HandoffHistoryMapper: TypeAlias = Callable[[list[TResponseInputItem]], list[TResponseInputItem]] @dataclass class Handoff(Generic[TContext, TAgent]): - """A handoff is when an agent delegates a task to another agent. 
- For example, in a customer support scenario you might have a "triage agent" that determines - which agent should handle the user's request, and sub-agents that specialize in different - areas like billing, account management, etc. - """ - tool_name: str - """The name of the tool that represents the handoff.""" - tool_description: str - """The description of the tool that represents the handoff.""" - input_json_schema: dict[str, Any] - """The JSON schema for the handoff input. Can be empty if the handoff does not take an input. - """ - on_invoke_handoff: Callable[[RunContextWrapper[Any], str], Awaitable[TAgent]] - """The function that invokes the handoff. The parameters passed are: - 1. The handoff run context - 2. The arguments from the LLM, as a JSON string. Empty string if input_json_schema is empty. - - Must return an agent. - """ - agent_name: str - """The name of the agent that is being handed off to.""" - input_filter: HandoffInputFilter | None = None - """A function that filters the inputs that are passed to the next agent. By default, the new - agent sees the entire conversation history. In some cases, you may want to filter inputs e.g. - to remove older inputs, or remove tools from existing inputs. - - The function will receive the entire conversation history so far, including the input item - that triggered the handoff and a tool call output item representing the handoff tool's output. - - You are free to modify the input history or new items as you see fit. The next agent that - runs will receive `handoff_input_data.all_items`. - - IMPORTANT: in streaming mode, we will not stream anything as a result of this function. The - items generated before will already have been streamed. - """ - + nest_handoff_history: bool | None = None strict_json_schema: bool = True - """Whether the input JSON schema is in strict mode. We **strongly** recommend setting this to - True, as it increases the likelihood of correct JSON input. - """ - is_enabled: bool | Callable[[RunContextWrapper[Any], AgentBase[Any]], MaybeAwaitable[bool]] = ( True ) - """Whether the handoff is enabled. Either a bool or a Callable that takes the run context and - agent and returns whether the handoff is enabled. You can use this to dynamically enable/disable - a handoff based on your context/state.""" def get_transfer_message(self, agent: AgentBase[Any]) -> str: return json.dumps({"assistant": agent.name}) @@ -143,65 +81,57 @@ def default_tool_description(cls, agent: AgentBase[Any]) -> str: @overload def handoff( - agent: Agent[TContext], + agent: "Agent[TContext]", *, tool_name_override: str | None = None, tool_description_override: str | None = None, input_filter: Callable[[HandoffInputData], HandoffInputData] | None = None, - is_enabled: bool | Callable[[RunContextWrapper[Any], Agent[Any]], MaybeAwaitable[bool]] = True, -) -> Handoff[TContext, Agent[TContext]]: ... + nest_handoff_history: bool | None = None, + is_enabled: bool + | Callable[[RunContextWrapper[Any], "Agent[Any]"], MaybeAwaitable[bool]] = True, +) -> Handoff[TContext, "Agent[TContext]"]: ... @overload def handoff( - agent: Agent[TContext], + agent: "Agent[TContext]", *, on_handoff: OnHandoffWithInput[THandoffInput], input_type: type[THandoffInput], tool_description_override: str | None = None, tool_name_override: str | None = None, input_filter: Callable[[HandoffInputData], HandoffInputData] | None = None, - is_enabled: bool | Callable[[RunContextWrapper[Any], Agent[Any]], MaybeAwaitable[bool]] = True, -) -> Handoff[TContext, Agent[TContext]]: ... 
+ nest_handoff_history: bool | None = None, + is_enabled: bool + | Callable[[RunContextWrapper[Any], "Agent[Any]"], MaybeAwaitable[bool]] = True, +) -> Handoff[TContext, "Agent[TContext]"]: ... @overload def handoff( - agent: Agent[TContext], + agent: "Agent[TContext]", *, on_handoff: OnHandoffWithoutInput, tool_description_override: str | None = None, tool_name_override: str | None = None, input_filter: Callable[[HandoffInputData], HandoffInputData] | None = None, - is_enabled: bool | Callable[[RunContextWrapper[Any], Agent[Any]], MaybeAwaitable[bool]] = True, -) -> Handoff[TContext, Agent[TContext]]: ... + nest_handoff_history: bool | None = None, + is_enabled: bool + | Callable[[RunContextWrapper[Any], "Agent[Any]"], MaybeAwaitable[bool]] = True, +) -> Handoff[TContext, "Agent[TContext]"]: ... def handoff( - agent: Agent[TContext], + agent: "Agent[TContext]", tool_name_override: str | None = None, tool_description_override: str | None = None, on_handoff: OnHandoffWithInput[THandoffInput] | OnHandoffWithoutInput | None = None, input_type: type[THandoffInput] | None = None, input_filter: Callable[[HandoffInputData], HandoffInputData] | None = None, + nest_handoff_history: bool | None = None, is_enabled: bool - | Callable[[RunContextWrapper[Any], Agent[TContext]], MaybeAwaitable[bool]] = True, -) -> Handoff[TContext, Agent[TContext]]: - """Create a handoff from an agent. - - Args: - agent: The agent to handoff to, or a function that returns an agent. - tool_name_override: Optional override for the name of the tool that represents the handoff. - tool_description_override: Optional override for the description of the tool that - represents the handoff. - on_handoff: A function that runs when the handoff is invoked. - input_type: the type of the input to the handoff. If provided, the input will be validated - against this type. Only relevant if you pass a function that takes an input. - input_filter: a function that filters the inputs that are passed to the next agent. - is_enabled: Whether the handoff is enabled. Can be a bool or a callable that takes the run - context and agent and returns whether the handoff is enabled. Disabled handoffs are - hidden from the LLM at runtime. 
- """ + | Callable[[RunContextWrapper[Any], "Agent[TContext]"], MaybeAwaitable[bool]] = True, +) -> Handoff[TContext, "Agent[TContext]"]: assert (on_handoff and input_type) or not (on_handoff and input_type), ( "You must provide either both on_handoff and input_type, or neither" ) @@ -224,7 +154,7 @@ def handoff( async def _invoke_handoff( ctx: RunContextWrapper[Any], input_json: str | None = None - ) -> Agent[TContext]: + ) -> "Agent[TContext]": if input_type is not None and type_adapter is not None: if input_json is None: _error_tracing.attach_error_to_current_span( @@ -256,22 +186,17 @@ async def _invoke_handoff( tool_name = tool_name_override or Handoff.default_tool_name(agent) tool_description = tool_description_override or Handoff.default_tool_description(agent) - - # Always ensure the input JSON schema is in strict mode - # If there is a need, we can make this configurable in the future input_json_schema = ensure_strict_json_schema(input_json_schema) - async def _is_enabled(ctx: RunContextWrapper[Any], agent_base: AgentBase[Any]) -> bool: - from .agent import Agent + async def _is_enabled(ctx: RunContextWrapper[Any], agent_base: "AgentBase[Any]") -> bool: + from ..agent import Agent assert callable(is_enabled), "is_enabled must be callable here" assert isinstance(agent_base, Agent), "Can't handoff to a non-Agent" result = is_enabled(ctx, agent_base) - if inspect.isawaitable(result): return await result - - return result + return bool(result) return Handoff( tool_name=tool_name, @@ -279,6 +204,21 @@ async def _is_enabled(ctx: RunContextWrapper[Any], agent_base: AgentBase[Any]) - input_json_schema=input_json_schema, on_invoke_handoff=_invoke_handoff, input_filter=input_filter, + nest_handoff_history=nest_handoff_history, agent_name=agent.name, is_enabled=_is_enabled if callable(is_enabled) else is_enabled, ) + + +__all__ = [ + "Handoff", + "HandoffHistoryMapper", + "HandoffInputData", + "HandoffInputFilter", + "default_handoff_history_mapper", + "get_conversation_history_wrappers", + "handoff", + "nest_handoff_history", + "reset_conversation_history_wrappers", + "set_conversation_history_wrappers", +] diff --git a/src/agents/handoffs/history.py b/src/agents/handoffs/history.py new file mode 100644 index 000000000..8469ddae4 --- /dev/null +++ b/src/agents/handoffs/history.py @@ -0,0 +1,241 @@ +from __future__ import annotations + +import json +from copy import deepcopy +from typing import TYPE_CHECKING, Any, cast + +from ..items import ( + HandoffCallItem, + HandoffOutputItem, + ItemHelpers, + ReasoningItem, + RunItem, + ToolCallItem, + ToolCallOutputItem, + TResponseInputItem, +) + +if TYPE_CHECKING: + from . import HandoffHistoryMapper, HandoffInputData + +__all__ = [ + "default_handoff_history_mapper", + "get_conversation_history_wrappers", + "nest_handoff_history", + "reset_conversation_history_wrappers", + "set_conversation_history_wrappers", +] + +_DEFAULT_CONVERSATION_HISTORY_START = "" +_DEFAULT_CONVERSATION_HISTORY_END = "" +_conversation_history_start = _DEFAULT_CONVERSATION_HISTORY_START +_conversation_history_end = _DEFAULT_CONVERSATION_HISTORY_END + + +def set_conversation_history_wrappers( + *, + start: str | None = None, + end: str | None = None, +) -> None: + """Override the markers that wrap the generated conversation summary. + + Pass ``None`` to leave either side unchanged. 
+ """ + + global _conversation_history_start, _conversation_history_end + if start is not None: + _conversation_history_start = start + if end is not None: + _conversation_history_end = end + + +def reset_conversation_history_wrappers() -> None: + """Restore the default ```` markers.""" + + global _conversation_history_start, _conversation_history_end + _conversation_history_start = _DEFAULT_CONVERSATION_HISTORY_START + _conversation_history_end = _DEFAULT_CONVERSATION_HISTORY_END + + +def get_conversation_history_wrappers() -> tuple[str, str]: + """Return the current start/end markers used for the nested conversation summary.""" + + return (_conversation_history_start, _conversation_history_end) + + +def nest_handoff_history( + handoff_input_data: HandoffInputData, + *, + history_mapper: HandoffHistoryMapper | None = None, +) -> HandoffInputData: + """Summarize the previous transcript for the next agent.""" + + normalized_history = _normalize_input_history(handoff_input_data.input_history) + flattened_history = _flatten_nested_history_messages(normalized_history) + pre_items_as_inputs = [ + _run_item_to_plain_input(item) for item in handoff_input_data.pre_handoff_items + ] + new_items_as_inputs = [_run_item_to_plain_input(item) for item in handoff_input_data.new_items] + transcript = flattened_history + pre_items_as_inputs + new_items_as_inputs + + mapper = history_mapper or default_handoff_history_mapper + history_items = mapper(transcript) + filtered_pre_items = tuple( + item + for item in handoff_input_data.pre_handoff_items + if _get_run_item_role(item) != "assistant" + ) + + return handoff_input_data.clone( + input_history=tuple(deepcopy(item) for item in history_items), + pre_handoff_items=filtered_pre_items, + ) + + +def default_handoff_history_mapper( + transcript: list[TResponseInputItem], +) -> list[TResponseInputItem]: + """Return a single assistant message summarizing the transcript.""" + + summary_message = _build_summary_message(transcript) + return [summary_message] + + +def _normalize_input_history( + input_history: str | tuple[TResponseInputItem, ...], +) -> list[TResponseInputItem]: + if isinstance(input_history, str): + return ItemHelpers.input_to_new_input_list(input_history) + return [deepcopy(item) for item in input_history] + + +def _run_item_to_plain_input(run_item: RunItem) -> TResponseInputItem: + return deepcopy(run_item.to_input_item()) + + +def _build_summary_message(transcript: list[TResponseInputItem]) -> TResponseInputItem: + transcript_copy = [deepcopy(item) for item in transcript] + if transcript_copy: + summary_lines = [ + f"{idx + 1}. 
{_format_transcript_item(item)}" + for idx, item in enumerate(transcript_copy) + ] + else: + summary_lines = ["(no previous turns recorded)"] + + start_marker, end_marker = get_conversation_history_wrappers() + content_lines = [ + "For context, here is the conversation so far between the user and the previous agent:", + start_marker, + *summary_lines, + end_marker, + ] + content = "\n".join(content_lines) + assistant_message: dict[str, Any] = { + "role": "assistant", + "content": content, + } + return cast(TResponseInputItem, assistant_message) + + +def _format_transcript_item(item: TResponseInputItem) -> str: + role = item.get("role") + if isinstance(role, str): + prefix = role + name = item.get("name") + if isinstance(name, str) and name: + prefix = f"{prefix} ({name})" + content_str = _stringify_content(item.get("content")) + return f"{prefix}: {content_str}" if content_str else prefix + + item_type = item.get("type", "item") + rest = {k: v for k, v in item.items() if k != "type"} + try: + serialized = json.dumps(rest, ensure_ascii=False, default=str) + except TypeError: + serialized = str(rest) + return f"{item_type}: {serialized}" if serialized else str(item_type) + + +def _stringify_content(content: Any) -> str: + if content is None: + return "" + if isinstance(content, str): + return content + try: + return json.dumps(content, ensure_ascii=False, default=str) + except TypeError: + return str(content) + + +def _flatten_nested_history_messages( + items: list[TResponseInputItem], +) -> list[TResponseInputItem]: + flattened: list[TResponseInputItem] = [] + for item in items: + nested_transcript = _extract_nested_history_transcript(item) + if nested_transcript is not None: + flattened.extend(nested_transcript) + continue + flattened.append(deepcopy(item)) + return flattened + + +def _extract_nested_history_transcript( + item: TResponseInputItem, +) -> list[TResponseInputItem] | None: + content = item.get("content") + if not isinstance(content, str): + return None + start_marker, end_marker = get_conversation_history_wrappers() + start_idx = content.find(start_marker) + end_idx = content.find(end_marker) + if start_idx == -1 or end_idx == -1 or end_idx <= start_idx: + return None + start_idx += len(start_marker) + body = content[start_idx:end_idx] + lines = [line.strip() for line in body.splitlines() if line.strip()] + parsed: list[TResponseInputItem] = [] + for line in lines: + parsed_item = _parse_summary_line(line) + if parsed_item is not None: + parsed.append(parsed_item) + return parsed + + +def _parse_summary_line(line: str) -> TResponseInputItem | None: + stripped = line.strip() + if not stripped: + return None + dot_index = stripped.find(".") + if dot_index != -1 and stripped[:dot_index].isdigit(): + stripped = stripped[dot_index + 1 :].lstrip() + role_part, sep, remainder = stripped.partition(":") + if not sep: + return None + role_text = role_part.strip() + if not role_text: + return None + role, name = _split_role_and_name(role_text) + reconstructed: dict[str, Any] = {"role": role} + if name: + reconstructed["name"] = name + content = remainder.strip() + if content: + reconstructed["content"] = content + return cast(TResponseInputItem, reconstructed) + + +def _split_role_and_name(role_text: str) -> tuple[str, str | None]: + if role_text.endswith(")") and "(" in role_text: + open_idx = role_text.rfind("(") + possible_name = role_text[open_idx + 1 : -1].strip() + role_candidate = role_text[:open_idx].strip() + if possible_name: + return (role_candidate or "developer", 
possible_name) + return (role_text or "developer", None) + + +def _get_run_item_role(run_item: RunItem) -> str | None: + role_candidate = run_item.to_input_item().get("role") + return role_candidate if isinstance(role_candidate, str) else None diff --git a/src/agents/run.py b/src/agents/run.py index 5b25df4f2..237bd8198 100644 --- a/src/agents/run.py +++ b/src/agents/run.py @@ -46,7 +46,7 @@ OutputGuardrail, OutputGuardrailResult, ) -from .handoffs import Handoff, HandoffInputFilter, handoff +from .handoffs import Handoff, HandoffHistoryMapper, HandoffInputFilter, handoff from .items import ( HandoffCallItem, ItemHelpers, @@ -198,6 +198,19 @@ class RunConfig: agent. See the documentation in `Handoff.input_filter` for more details. """ + nest_handoff_history: bool = True + """Wrap prior run history in a single assistant message before handing off when no custom + input filter is set. Set to False to preserve the raw transcript behavior from previous + releases. + """ + + handoff_history_mapper: HandoffHistoryMapper | None = None + """Optional function that receives the normalized transcript (history + handoff items) and + returns the input history that should be passed to the next agent. When left as `None`, the + runner collapses the transcript into a single assistant message. This function only runs when + `nest_handoff_history` is True. + """ + input_guardrails: list[InputGuardrail[Any]] | None = None """A list of input guardrails to run on the initial run input.""" diff --git a/tests/test_agent_runner.py b/tests/test_agent_runner.py index ceac6c50f..d05496e50 100644 --- a/tests/test_agent_runner.py +++ b/tests/test_agent_runner.py @@ -43,6 +43,14 @@ from .utils.simple_session import SimpleListSession +def _as_message(item: Any) -> dict[str, Any]: + assert isinstance(item, dict) + role = item.get("role") + assert isinstance(role, str) + assert role in {"assistant", "user", "system", "developer"} + return cast(dict[str, Any], item) + + @pytest.mark.asyncio async def test_simple_first_run(): model = FakeModel() @@ -165,8 +173,8 @@ async def test_handoffs(): assert result.final_output == "done" assert len(result.raw_responses) == 3, "should have three model responses" assert len(result.to_input_list()) == 7, ( - "should have 7 inputs: orig input, tool call, tool result, message, handoff, handoff" - "result, and done message" + "should have 7 inputs: summary message, tool call, tool result, message, handoff, " + "handoff result, and done message" ) assert result.last_agent == agent_1, "should have handed off to agent_1" @@ -218,9 +226,9 @@ async def test_structured_output(): assert result.final_output == Foo(bar="baz") assert len(result.raw_responses) == 4, "should have four model responses" - assert len(result.to_input_list()) == 11, ( - "should have input: 2 orig inputs, function call, function call result, message, handoff, " - "handoff output, preamble message, tool call, tool call result, final output" + assert len(result.to_input_list()) == 10, ( + "should have input: conversation summary, function call, function call result, message, " + "handoff, handoff output, preamble message, tool call, tool call result, final output" ) assert result.last_agent == agent_1, "should have handed off to agent_1" @@ -270,6 +278,97 @@ async def test_handoff_filters(): ) +@pytest.mark.asyncio +async def test_default_handoff_history_nested_and_filters_respected(): + model = FakeModel() + agent_1 = Agent( + name="delegate", + model=model, + ) + agent_2 = Agent( + name="triage", + model=model, + 
handoffs=[agent_1], + ) + + model.add_multiple_turn_outputs( + [ + [get_text_message("triage summary"), get_handoff_tool_call(agent_1)], + [get_text_message("resolution")], + ] + ) + + result = await Runner.run(agent_2, input="user_message") + + assert isinstance(result.input, list) + assert len(result.input) == 1 + summary = _as_message(result.input[0]) + assert summary["role"] == "assistant" + summary_content = summary["content"] + assert isinstance(summary_content, str) + assert "" in summary_content + assert "triage summary" in summary_content + assert "user_message" in summary_content + + passthrough_model = FakeModel() + delegate = Agent(name="delegate", model=passthrough_model) + + def passthrough_filter(data: HandoffInputData) -> HandoffInputData: + return data + + triage_with_filter = Agent( + name="triage", + model=passthrough_model, + handoffs=[handoff(delegate, input_filter=passthrough_filter)], + ) + + passthrough_model.add_multiple_turn_outputs( + [ + [get_text_message("triage summary"), get_handoff_tool_call(delegate)], + [get_text_message("resolution")], + ] + ) + + filtered_result = await Runner.run(triage_with_filter, input="user_message") + + assert isinstance(filtered_result.input, str) + assert filtered_result.input == "user_message" + + +@pytest.mark.asyncio +async def test_default_handoff_history_accumulates_across_multiple_handoffs(): + triage_model = FakeModel() + delegate_model = FakeModel() + closer_model = FakeModel() + + closer = Agent(name="closer", model=closer_model) + delegate = Agent(name="delegate", model=delegate_model, handoffs=[closer]) + triage = Agent(name="triage", model=triage_model, handoffs=[delegate]) + + triage_model.add_multiple_turn_outputs( + [[get_text_message("triage summary"), get_handoff_tool_call(delegate)]] + ) + delegate_model.add_multiple_turn_outputs( + [[get_text_message("delegate update"), get_handoff_tool_call(closer)]] + ) + closer_model.add_multiple_turn_outputs([[get_text_message("resolution")]]) + + result = await Runner.run(triage, input="user_question") + + assert result.final_output == "resolution" + assert closer_model.first_turn_args is not None + closer_input = closer_model.first_turn_args["input"] + assert isinstance(closer_input, list) + summary = _as_message(closer_input[0]) + assert summary["role"] == "assistant" + summary_content = summary["content"] + assert isinstance(summary_content, str) + assert summary_content.count("") == 1 + assert "triage summary" in summary_content + assert "delegate update" in summary_content + assert "user_question" in summary_content + + @pytest.mark.asyncio async def test_async_input_filter_supported(): # DO NOT rename this without updating pyproject.toml diff --git a/tests/test_agent_runner_streamed.py b/tests/test_agent_runner_streamed.py index eca23464b..222afda78 100644 --- a/tests/test_agent_runner_streamed.py +++ b/tests/test_agent_runner_streamed.py @@ -176,8 +176,8 @@ async def test_handoffs(): assert result.final_output == "done" assert len(result.raw_responses) == 3, "should have three model responses" assert len(result.to_input_list()) == 7, ( - "should have 7 inputs: orig input, tool call, tool result, message, handoff, handoff" - "result, and done message" + "should have 7 inputs: summary message, tool call, tool result, message, handoff, " + "handoff result, and done message" ) assert result.last_agent == agent_1, "should have handed off to agent_1" @@ -231,9 +231,9 @@ async def test_structured_output(): assert result.final_output == Foo(bar="baz") assert 
len(result.raw_responses) == 4, "should have four model responses" - assert len(result.to_input_list()) == 11, ( - "should have input: 2 orig inputs, function call, function call result, message, handoff, " - "handoff output, preamble message, tool call, tool call result, final output" + assert len(result.to_input_list()) == 10, ( + "should have input: conversation summary, function call, function call result, message, " + "handoff, handoff output, preamble message, tool call, tool call result, final output" ) assert result.last_agent == agent_1, "should have handed off to agent_1" @@ -717,9 +717,9 @@ async def test_streaming_events(): assert result.final_output == Foo(bar="baz") assert len(result.raw_responses) == 4, "should have four model responses" - assert len(result.to_input_list()) == 10, ( - "should have input: 2 orig inputs, function call, function call result, message, handoff, " - "handoff output, tool call, tool call result, final output" + assert len(result.to_input_list()) == 9, ( + "should have input: conversation summary, function call, function call result, message, " + "handoff, handoff output, tool call, tool call result, final output" ) assert result.last_agent == agent_1, "should have handed off to agent_1" diff --git a/tests/test_extension_filters.py b/tests/test_extension_filters.py index 0c7732995..86161bbb7 100644 --- a/tests/test_extension_filters.py +++ b/tests/test_extension_filters.py @@ -1,8 +1,18 @@ +from copy import deepcopy +from typing import Any, cast + from openai.types.responses import ResponseOutputMessage, ResponseOutputText from openai.types.responses.response_reasoning_item import ResponseReasoningItem -from agents import Agent, HandoffInputData, RunContextWrapper -from agents.extensions.handoff_filters import remove_all_tools +from agents import ( + Agent, + HandoffInputData, + RunContextWrapper, + get_conversation_history_wrappers, + reset_conversation_history_wrappers, + set_conversation_history_wrappers, +) +from agents.extensions.handoff_filters import nest_handoff_history, remove_all_tools from agents.items import ( HandoffOutputItem, MessageOutputItem, @@ -25,6 +35,13 @@ def _get_message_input_item(content: str) -> TResponseInputItem: } +def _get_user_input_item(content: str) -> TResponseInputItem: + return { + "role": "user", + "content": content, + } + + def _get_reasoning_input_item() -> TResponseInputItem: return {"id": "rid", "summary": [], "type": "reasoning"} @@ -91,6 +108,14 @@ def _get_reasoning_output_run_item() -> ReasoningItem: ) +def _as_message(item: TResponseInputItem) -> dict[str, Any]: + assert isinstance(item, dict) + role = item.get("role") + assert isinstance(role, str) + assert role in {"assistant", "user", "system", "developer"} + return cast(dict[str, Any], item) + + def test_empty_data(): handoff_input_data = HandoffInputData( input_history=(), @@ -221,3 +246,155 @@ def test_removes_handoffs_from_history(): assert len(filtered_data.input_history) == 1 assert len(filtered_data.pre_handoff_items) == 1 assert len(filtered_data.new_items) == 1 + + +def test_nest_handoff_history_wraps_transcript() -> None: + data = HandoffInputData( + input_history=(_get_user_input_item("Hello"),), + pre_handoff_items=(_get_message_output_run_item("Assist reply"),), + new_items=( + _get_message_output_run_item("Handoff request"), + _get_handoff_output_run_item("transfer"), + ), + run_context=RunContextWrapper(context=()), + ) + + nested = nest_handoff_history(data) + + assert isinstance(nested.input_history, tuple) + assert 
len(nested.input_history) == 1 + summary = _as_message(nested.input_history[0]) + assert summary["role"] == "assistant" + summary_content = summary["content"] + assert isinstance(summary_content, str) + start_marker, end_marker = get_conversation_history_wrappers() + assert start_marker in summary_content + assert end_marker in summary_content + assert "Assist reply" in summary_content + assert "Hello" in summary_content + assert len(nested.pre_handoff_items) == 0 + assert nested.new_items == data.new_items + + +def test_nest_handoff_history_handles_missing_user() -> None: + data = HandoffInputData( + input_history=(), + pre_handoff_items=(_get_reasoning_output_run_item(),), + new_items=(), + run_context=RunContextWrapper(context=()), + ) + + nested = nest_handoff_history(data) + + assert isinstance(nested.input_history, tuple) + assert len(nested.input_history) == 1 + summary = _as_message(nested.input_history[0]) + assert summary["role"] == "assistant" + summary_content = summary["content"] + assert isinstance(summary_content, str) + assert "reasoning" in summary_content.lower() + + +def test_nest_handoff_history_appends_existing_history() -> None: + first = HandoffInputData( + input_history=(_get_user_input_item("Hello"),), + pre_handoff_items=(_get_message_output_run_item("First reply"),), + new_items=(), + run_context=RunContextWrapper(context=()), + ) + + first_nested = nest_handoff_history(first) + assert isinstance(first_nested.input_history, tuple) + summary_message = first_nested.input_history[0] + + follow_up_history: tuple[TResponseInputItem, ...] = ( + summary_message, + _get_user_input_item("Another question"), + ) + + second = HandoffInputData( + input_history=follow_up_history, + pre_handoff_items=(_get_message_output_run_item("Second reply"),), + new_items=(_get_handoff_output_run_item("transfer"),), + run_context=RunContextWrapper(context=()), + ) + + second_nested = nest_handoff_history(second) + + assert isinstance(second_nested.input_history, tuple) + summary = _as_message(second_nested.input_history[0]) + assert summary["role"] == "assistant" + content = summary["content"] + assert isinstance(content, str) + start_marker, end_marker = get_conversation_history_wrappers() + assert content.count(start_marker) == 1 + assert content.count(end_marker) == 1 + assert "First reply" in content + assert "Second reply" in content + assert "Another question" in content + + +def test_nest_handoff_history_honors_custom_wrappers() -> None: + data = HandoffInputData( + input_history=(_get_user_input_item("Hello"),), + pre_handoff_items=(_get_message_output_run_item("First reply"),), + new_items=(_get_message_output_run_item("Second reply"),), + run_context=RunContextWrapper(context=()), + ) + + set_conversation_history_wrappers(start="<>", end="<>") + try: + nested = nest_handoff_history(data) + assert isinstance(nested.input_history, tuple) + assert len(nested.input_history) == 1 + summary = _as_message(nested.input_history[0]) + summary_content = summary["content"] + assert isinstance(summary_content, str) + lines = summary_content.splitlines() + assert lines[0] == ( + "For context, here is the conversation so far between the user and the previous agent:" + ) + assert lines[1].startswith("<>") + assert summary_content.endswith("<>") + + # Ensure the custom markers are parsed correctly when nesting again. 
+ second_nested = nest_handoff_history(nested) + assert isinstance(second_nested.input_history, tuple) + second_summary = _as_message(second_nested.input_history[0]) + content = second_summary["content"] + assert isinstance(content, str) + assert content.count("<>") == 1 + assert content.count("<>") == 1 + finally: + reset_conversation_history_wrappers() + + +def test_nest_handoff_history_supports_custom_mapper() -> None: + data = HandoffInputData( + input_history=(_get_user_input_item("Hello"),), + pre_handoff_items=(_get_message_output_run_item("Assist reply"),), + new_items=(), + run_context=RunContextWrapper(context=()), + ) + + def map_history(items: list[TResponseInputItem]) -> list[TResponseInputItem]: + reversed_items = list(reversed(items)) + return [deepcopy(item) for item in reversed_items] + + nested = nest_handoff_history(data, history_mapper=map_history) + + assert isinstance(nested.input_history, tuple) + assert len(nested.input_history) == 2 + first = _as_message(nested.input_history[0]) + second = _as_message(nested.input_history[1]) + assert first["role"] == "assistant" + first_content = first.get("content") + assert isinstance(first_content, list) + assert any( + isinstance(chunk, dict) + and chunk.get("type") == "output_text" + and chunk.get("text") == "Assist reply" + for chunk in first_content + ) + assert second["role"] == "user" + assert second["content"] == "Hello" diff --git a/tests/test_run_step_processing.py b/tests/test_run_step_processing.py index 27d36afa8..e8bb715ad 100644 --- a/tests/test_run_step_processing.py +++ b/tests/test_run_step_processing.py @@ -1,9 +1,12 @@ from __future__ import annotations +from typing import Any, cast + import pytest from openai.types.responses import ( ResponseComputerToolCall, ResponseFileSearchToolCall, + ResponseFunctionToolCall, ResponseFunctionWebSearch, ) from openai.types.responses.response_computer_tool_call import ActionClick @@ -16,14 +19,19 @@ Computer, ComputerTool, Handoff, + HandoffInputData, ModelBehaviorError, ModelResponse, ReasoningItem, + RunConfig, RunContextWrapper, + RunItem, ToolCallItem, Usage, + handoff, ) -from agents._run_impl import RunImpl +from agents._run_impl import RunImpl, ToolRunHandoff +from agents import RunHooks from agents.run import AgentRunner from .test_responses import ( @@ -31,6 +39,7 @@ get_function_tool, get_function_tool_call, get_handoff_tool_call, + get_text_input_item, get_text_message, ) @@ -202,6 +211,100 @@ async def test_handoffs_parsed_correctly(): assert handoff_agent == agent_1 +@pytest.mark.asyncio +async def test_handoff_can_disable_run_level_history_nesting(monkeypatch: pytest.MonkeyPatch): + source_agent = Agent(name="source") + target_agent = Agent(name="target") + override_handoff = handoff(target_agent, nest_handoff_history=False) + tool_call = cast(ResponseFunctionToolCall, get_handoff_tool_call(target_agent)) + run_handoffs = [ToolRunHandoff(handoff=override_handoff, tool_call=tool_call)] + run_config = RunConfig(nest_handoff_history=True) + context_wrapper = RunContextWrapper(context=None) + hooks = RunHooks() + original_input = [get_text_input_item("hello")] + pre_step_items: list[RunItem] = [] + new_step_items: list[RunItem] = [] + new_response = ModelResponse(output=[tool_call], usage=Usage(), response_id=None) + + calls: list[HandoffInputData] = [] + + def fake_nest( + handoff_input_data: HandoffInputData, + *, + history_mapper: Any, + ) -> HandoffInputData: + calls.append(handoff_input_data) + return handoff_input_data + + 
monkeypatch.setattr("agents._run_impl.nest_handoff_history", fake_nest) + + result = await RunImpl.execute_handoffs( + agent=source_agent, + original_input=list(original_input), + pre_step_items=pre_step_items, + new_step_items=new_step_items, + new_response=new_response, + run_handoffs=run_handoffs, + hooks=hooks, + context_wrapper=context_wrapper, + run_config=run_config, + ) + + assert calls == [] + assert result.original_input == original_input + + +@pytest.mark.asyncio +async def test_handoff_can_enable_history_nesting(monkeypatch: pytest.MonkeyPatch): + source_agent = Agent(name="source") + target_agent = Agent(name="target") + override_handoff = handoff(target_agent, nest_handoff_history=True) + tool_call = cast(ResponseFunctionToolCall, get_handoff_tool_call(target_agent)) + run_handoffs = [ToolRunHandoff(handoff=override_handoff, tool_call=tool_call)] + run_config = RunConfig(nest_handoff_history=False) + context_wrapper = RunContextWrapper(context=None) + hooks = RunHooks() + original_input = [get_text_input_item("hello")] + pre_step_items: list[RunItem] = [] + new_step_items: list[RunItem] = [] + new_response = ModelResponse(output=[tool_call], usage=Usage(), response_id=None) + + def fake_nest( + handoff_input_data: HandoffInputData, + *, + history_mapper: Any, + ) -> HandoffInputData: + return handoff_input_data.clone( + input_history=( + { + "role": "assistant", + "content": "nested", + }, + ) + ) + + monkeypatch.setattr("agents._run_impl.nest_handoff_history", fake_nest) + + result = await RunImpl.execute_handoffs( + agent=source_agent, + original_input=list(original_input), + pre_step_items=pre_step_items, + new_step_items=new_step_items, + new_response=new_response, + run_handoffs=run_handoffs, + hooks=hooks, + context_wrapper=context_wrapper, + run_config=run_config, + ) + + assert result.original_input == [ + { + "role": "assistant", + "content": "nested", + } + ] + + @pytest.mark.asyncio async def test_missing_handoff_fails(): agent_1 = Agent(name="test_1")