Changes from all commits
44 commits
5a7076b
Don't re-run workflows on un/approvals (#516)
zastrowm Jul 22, 2025
9aba018
Fixing some typos in various texts (#487)
didier-durand Jul 22, 2025
040ba21
docs(readme): add hot reloading documentation for load_tools_from_dir…
cagataycali Jul 22, 2025
022ec55
ci: enable integ tests for anthropic, cohere, mistral, openai, writer…
dbschmigelski Jul 22, 2025
e597e07
Automatically flatten nested tool collections (#508)
zastrowm Jul 23, 2025
4f4e5ef
feat(a2a): support mounts for containerized deployments (#524)
jer96 Jul 23, 2025
b30e7e6
fix: include agent trace into tool for agent as tools (#526)
poshinchen Jul 23, 2025
8c55625
Support for Amazon SageMaker AI endpoints as Model Provider (#176)
dgallitelli Jul 28, 2025
3f4c3a3
fix: Remove leftover print statement from sagemaker model provider (#…
mehtarac Jul 28, 2025
bdc893b
[Feat] Update structured output error message (#563)
Unshure Jul 29, 2025
4e0e0a6
feat(mcp): retain structured content in the AgentTool response (#528)
dbschmigelski Jul 29, 2025
b13c5c5
feat(mcp): Add list_prompts, get_prompt methods (#160)
Ketansuhaas Jul 30, 2025
c5e4e51
fix(event_loop): raise dedicated exception when encountering max toke…
dbschmigelski Jul 30, 2025
6703819
fix: update integ tests
dbschmigelski Jul 30, 2025
3d526f2
fix(deps): pin a2a-sdk>=0.2.16 to resolve #572 (#581)
minorun365 Jul 31, 2025
c94b74e
fix: rename exception message, add to exception, move earlier in cycle
dbschmigelski Jul 31, 2025
36dd0f9
Update tests_integ/test_max_tokens_reached.py
dbschmigelski Jul 31, 2025
e04c73d
Update tests_integ/test_max_tokens_reached.py
dbschmigelski Jul 31, 2025
cca2f86
linting
dbschmigelski Jul 31, 2025
f647baa
Merge branch 'strands-agents:main' into fix-max-tokens
dbschmigelski Jul 31, 2025
b56a4ff
chore: pin a2a to a minor version while it is still in beta (#586)
dbschmigelski Aug 1, 2025
78c5a91
Merge branch 'strands-agents:main' into fix-max-tokens
dbschmigelski Aug 1, 2025
8b1de4d
fix: uses new a2a snake_case for lints to pass (#591)
theagenticguy Aug 1, 2025
c85464c
fix(event_loop): raise dedicated exception when encountering max toke…
dbschmigelski Aug 1, 2025
a208496
Merge branch 'strands-agents:main' into fix-max-tokens
dbschmigelski Aug 4, 2025
2e2d4df
feat: add builtin hook provider to address max tokens reached truncation
dbschmigelski Aug 4, 2025
447d147
tests: modify integ test to inspect message history
dbschmigelski Aug 4, 2025
564895d
fix: fix linting errors
dbschmigelski Aug 4, 2025
2f118fb
fix: linting
dbschmigelski Aug 4, 2025
e5fc51a
refactor: switch from hook approach to conversation manager
dbschmigelski Aug 5, 2025
5906fc2
linting
dbschmigelski Aug 5, 2025
87445a3
fix: test contained incorrect assertions
dbschmigelski Aug 6, 2025
924fea9
fix: add event emission
dbschmigelski Aug 6, 2025
104f6b4
feat: move to async
dbschmigelski Aug 6, 2025
11b91f4
feat: add additional error case where no tool uses were fixed
dbschmigelski Aug 6, 2025
1da9ba7
feat: add max tokens reached test
dbschmigelski Aug 6, 2025
623f3c7
linting
dbschmigelski Aug 6, 2025
66c4c07
feat: add max tokens reached test
dbschmigelski Aug 6, 2025
4b5c5a7
feat: switch to a default behavior to recover from max tokens reached
dbschmigelski Aug 7, 2025
83ad822
fix: all tool uses now must be replaced
dbschmigelski Aug 8, 2025
faa4618
fix: boolean
dbschmigelski Aug 8, 2025
fa8195f
remove todo
dbschmigelski Aug 8, 2025
d521a2c
Update README.md
dbschmigelski Oct 23, 2025
e57e398
Update README.md
dbschmigelski Oct 23, 2025
2 changes: 1 addition & 1 deletion .github/workflows/pr-and-push.yml
@@ -3,7 +3,7 @@ name: Pull Request and Push Action
on:
  pull_request: # Safer than pull_request_target for untrusted code
    branches: [ main ]
-    types: [opened, synchronize, reopened, ready_for_review, review_requested, review_request_removed]
+    types: [opened, synchronize, reopened, ready_for_review]
  push:
    branches: [ main ] # Also run on direct pushes to main
concurrency:
14 changes: 14 additions & 0 deletions README.md
@@ -21,6 +21,9 @@
<a href="https://pypi.org/project/strands-agents/"><img alt="PyPI version" src="https://img.shields.io/pypi/v/strands-agents"/></a>
<a href="https://python.org"><img alt="Python versions" src="https://img.shields.io/pypi/pyversions/strands-agents"/></a>
</div>




<p>
<a href="https://strandsagents.com/">Documentation</a>
@@ -91,6 +94,17 @@ agent = Agent(tools=[word_count])
response = agent("How many words are in this sentence?")
```

**Hot Reloading from Directory:**
Enable automatic tool loading and reloading from the `./tools/` directory:

```python
from strands import Agent

# Agent will watch ./tools/ directory for changes
agent = Agent(load_tools_from_directory=True)
response = agent("Use any tools you find in the tools directory")
```

### MCP Support

Seamlessly integrate Model Context Protocol (MCP) servers:
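To make the README addition concrete: a tool module dropped into the watched `./tools/` directory might look like the sketch below. This is illustrative, not part of the PR; it assumes the `@tool` decorator pattern behind the README's existing `word_count` example.

```python
# ./tools/word_count.py — picked up (and hot-reloaded on edit) when the
# agent is created with load_tools_from_directory=True.
from strands import tool


@tool
def word_count(text: str) -> int:
    """Count the number of words in the given text."""
    return len(text.split())
```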
19 changes: 13 additions & 6 deletions pyproject.toml
@@ -29,7 +29,7 @@ dependencies = [
"boto3>=1.26.0,<2.0.0",
"botocore>=1.29.0,<2.0.0",
"docstring_parser>=0.15,<1.0",
"mcp>=1.8.0,<2.0.0",
"mcp>=1.11.0,<2.0.0",
"pydantic>=2.0.0,<3.0.0",
"typing-extensions>=4.13.2,<5.0.0",
"watchdog>=6.0.0,<7.0.0",
@@ -89,8 +89,15 @@ writer = [
"writer-sdk>=2.2.0,<3.0.0"
]

sagemaker = [
"boto3>=1.26.0,<2.0.0",
"botocore>=1.29.0,<2.0.0",
"boto3-stubs[sagemaker-runtime]>=1.26.0,<2.0.0"
]

a2a = [
"a2a-sdk[sql]>=0.2.16,<1.0.0",
"a2a-sdk>=0.3.0,<0.4.0",
"a2a-sdk[sql]>=0.3.0,<0.4.0",
"uvicorn>=0.34.2,<1.0.0",
"httpx>=0.28.1,<1.0.0",
"fastapi>=0.115.12,<1.0.0",
@@ -136,7 +143,7 @@ all = [
"opentelemetry-exporter-otlp-proto-http>=1.30.0,<2.0.0",

# a2a
"a2a-sdk[sql]>=0.2.16,<1.0.0",
"a2a-sdk[sql]>=0.3.0,<0.4.0",
"uvicorn>=0.34.2,<1.0.0",
"httpx>=0.28.1,<1.0.0",
"fastapi>=0.115.12,<1.0.0",
@@ -148,7 +155,7 @@
source = "vcs"

[tool.hatch.envs.hatch-static-analysis]
-features = ["anthropic", "litellm", "llamaapi", "ollama", "openai", "otel", "mistral", "writer", "a2a"]
+features = ["anthropic", "litellm", "llamaapi", "ollama", "openai", "otel", "mistral", "writer", "a2a", "sagemaker"]
dependencies = [
    "mypy>=1.15.0,<2.0.0",
    "ruff>=0.11.6,<0.12.0",
@@ -171,7 +178,7 @@ lint-fix = [
]

[tool.hatch.envs.hatch-test]
-features = ["anthropic", "litellm", "llamaapi", "ollama", "openai", "otel", "mistral", "writer", "a2a"]
+features = ["anthropic", "litellm", "llamaapi", "ollama", "openai", "otel", "mistral", "writer", "a2a", "sagemaker"]
extra-dependencies = [
    "moto>=5.1.0,<6.0.0",
    "pytest>=8.0.0,<9.0.0",
@@ -187,7 +194,7 @@ extra-args = [

[tool.hatch.envs.dev]
dev-mode = true
-features = ["dev", "docs", "anthropic", "litellm", "llamaapi", "ollama", "otel", "mistral", "writer", "a2a"]
+features = ["dev", "docs", "anthropic", "litellm", "llamaapi", "ollama", "otel", "mistral", "writer", "a2a", "sagemaker"]

[[tool.hatch.envs.hatch-test.matrix]]
python = ["3.13", "3.12", "3.11", "3.10"]
@@ -36,7 +36,7 @@ def restore_from_session(self, state: dict[str, Any]) -> Optional[list[Message]]
        Args:
            state: Previous state of the conversation manager
        Returns:
-            Optional list of messages to prepend to the agents messages. By defualt returns None.
+            Optional list of messages to prepend to the agents messages. By default returns None.
        """
        if state.get("__name__") != self.__class__.__name__:
            raise ValueError("Invalid conversation manager state.")
71 changes: 71 additions & 0 deletions src/strands/event_loop/_recover_message_on_max_tokens_reached.py
@@ -0,0 +1,71 @@
"""Message recovery utilities for handling max token limit scenarios.

This module provides functionality to recover and clean up incomplete messages that occur
when model responses are truncated due to maximum token limits being reached. It specifically
handles cases where tool use blocks are incomplete or malformed due to truncation.
"""

import logging

from ..types.content import ContentBlock, Message
from ..types.tools import ToolUse

logger = logging.getLogger(__name__)


def recover_message_on_max_tokens_reached(message: Message) -> Message:
"""Recover and clean up messages when max token limits are reached.

When a model response is truncated due to maximum token limits, all tool use blocks
should be replaced with informative error messages since they may be incomplete or
unreliable. This function inspects the message content and:

1. Identifies all tool use blocks (regardless of validity)
2. Replaces all tool uses with informative error messages
3. Preserves all non-tool content blocks (text, images, etc.)
4. Returns a cleaned message suitable for conversation history

This recovery mechanism ensures that the conversation can continue gracefully even when
model responses are truncated, providing clear feedback about what happened and preventing
potentially incomplete or corrupted tool executions.

Args:
message: The potentially incomplete message from the model that was truncated
due to max token limits.

Returns:
A cleaned Message with all tool uses replaced by explanatory text content.
The returned message maintains the same role as the input message.

Example:
If a message contains any tool use (complete or incomplete):
```
{"toolUse": {"name": "calculator", "input": {"expression": "2+2"}, "toolUseId": "123"}}
```

It will be replaced with:
```
{"text": "The selected tool calculator's tool use was incomplete due to maximum token limits being reached."}
```
"""
logger.info("handling max_tokens stop reason - replacing all tool uses with error messages")

valid_content: list[ContentBlock] = []
for content in message["content"] or []:
tool_use: ToolUse | None = content.get("toolUse")
if not tool_use:
valid_content.append(content)
continue

# Replace all tool uses with error messages when max_tokens is reached
display_name = tool_use.get("name") or "<unknown>"
logger.warning("tool_name=<%s> | replacing with error message due to max_tokens truncation.", display_name)

valid_content.append(
{
"text": f"The selected tool {display_name}'s tool use was incomplete due "
f"to maximum token limits being reached."
}
)

return {"content": valid_content, "role": message["role"]}
30 changes: 28 additions & 2 deletions src/strands/event_loop/event_loop.py
@@ -28,9 +28,15 @@
from ..telemetry.tracer import get_tracer
from ..tools.executor import run_tools, validate_and_prepare_tools
from ..types.content import Message
-from ..types.exceptions import ContextWindowOverflowException, EventLoopException, ModelThrottledException
+from ..types.exceptions import (
+    ContextWindowOverflowException,
+    EventLoopException,
+    MaxTokensReachedException,
+    ModelThrottledException,
+)
from ..types.streaming import Metrics, StopReason
from ..types.tools import ToolChoice, ToolChoiceAuto, ToolConfig, ToolGenerator, ToolResult, ToolUse
+from ._recover_message_on_max_tokens_reached import recover_message_on_max_tokens_reached
from .streaming import stream_messages

if TYPE_CHECKING:
@@ -151,6 +157,9 @@ async def event_loop_cycle(agent: "Agent", invocation_state: dict[str, Any]) ->
            )
        )

+        if stop_reason == "max_tokens":
+            message = recover_message_on_max_tokens_reached(message)
+
        if model_invoke_span:
            tracer.end_model_invoke_span(model_invoke_span, message, usage, stop_reason)
        break  # Success! Break out of retry loop
@@ -200,6 +209,22 @@ async def event_loop_cycle(agent: "Agent", invocation_state: dict[str, Any]) ->
    agent.event_loop_metrics.update_usage(usage)
    agent.event_loop_metrics.update_metrics(metrics)

+    if stop_reason == "max_tokens":
+        """
+        Handle max_tokens limit reached by the model.
+
+        When the model reaches its maximum token limit, this represents a potentially unrecoverable
+        state where the model's response was truncated. By default, Strands fails hard with a
+        MaxTokensReachedException to maintain consistency with other failure types.
+        """
+        raise MaxTokensReachedException(
+            message=(
+                "Agent has reached an unrecoverable state due to max_tokens limit. "
+                "For more information see: "
+                "https://strandsagents.com/latest/user-guide/concepts/agents/agent-loop/#maxtokensreachedexception"
+            )
+        )
+
    # If the model is requesting to use tools
    if stop_reason == "tool_use":
        # Handle tool execution
@@ -231,7 +256,8 @@ async def event_loop_cycle(agent: "Agent", invocation_state: dict[str, Any]) ->
        # Don't yield or log the exception - we already did it when we
        # raised the exception and we don't need that duplication.
        raise
-    except ContextWindowOverflowException as e:
+    except (ContextWindowOverflowException, MaxTokensReachedException) as e:
+        # Special cased exceptions which we want to bubble up rather than get wrapped in an EventLoopException
        if cycle_span:
            tracer.end_span_with_error(cycle_span, str(e), e)
        raise e
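Callers can now distinguish this failure mode from errors wrapped in EventLoopException. A hedged sketch of caller-side handling — the exception's import path mirrors the one used in the diff above, while the Agent construction and prompt are illustrative:

```python
from strands import Agent
from strands.types.exceptions import MaxTokensReachedException

agent = Agent()

try:
    agent("Summarize this very long document in exhaustive detail...")
except MaxTokensReachedException as err:
    # Before this is raised, the recovery step shown earlier has already
    # replaced any truncated tool uses in the assistant message, so no
    # corrupted tool-use blocks are carried forward.
    print(f"Stopped at token limit: {err}")
```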
2 changes: 1 addition & 1 deletion src/strands/models/anthropic.py
@@ -414,7 +414,7 @@ async def structured_output(
        stop_reason, messages, _, _ = event["stop"]

        if stop_reason != "tool_use":
-            raise ValueError("No valid tool use or tool use input was found in the Anthropic response.")
+            raise ValueError(f'Model returned stop_reason: {stop_reason} instead of "tool_use".')

        content = messages["content"]
        output_response: dict[str, Any] | None = None
55 changes: 51 additions & 4 deletions src/strands/models/bedrock.py
@@ -17,10 +17,10 @@

from ..event_loop import streaming
from ..tools import convert_pydantic_to_tool_spec
-from ..types.content import Messages
+from ..types.content import ContentBlock, Message, Messages
from ..types.exceptions import ContextWindowOverflowException, ModelThrottledException
from ..types.streaming import StreamEvent
-from ..types.tools import ToolSpec
+from ..types.tools import ToolResult, ToolSpec
from .model import Model

logger = logging.getLogger(__name__)
@@ -181,7 +181,7 @@ def format_request(
"""
return {
"modelId": self.config["model_id"],
"messages": messages,
"messages": self._format_bedrock_messages(messages),
"system": [
*([{"text": system_prompt}] if system_prompt else []),
*([{"cachePoint": {"type": self.config["cache_prompt"]}}] if self.config.get("cache_prompt") else []),
@@ -246,6 +246,53 @@ def format_request(
            ),
        }

    def _format_bedrock_messages(self, messages: Messages) -> Messages:
        """Format messages for Bedrock API compatibility.

        This function ensures messages conform to Bedrock's expected format by:

        - Cleaning tool result content blocks by removing additional fields that may be
          useful for retaining information in hooks but would cause Bedrock validation
          exceptions when presented with unexpected fields
        - Ensuring all message content blocks are properly formatted for the Bedrock API

        Args:
            messages: List of messages to format

        Returns:
            Messages formatted for Bedrock API compatibility

        Note:
            Bedrock will throw validation exceptions when presented with additional
            unexpected fields in tool result blocks.
            https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolResultBlock.html
        """
        cleaned_messages = []

        for message in messages:
            cleaned_content: list[ContentBlock] = []

            for content_block in message["content"]:
                if "toolResult" in content_block:
                    # Create a new content block with only the cleaned toolResult
                    tool_result: ToolResult = content_block["toolResult"]

                    # Keep only the required fields for Bedrock
                    cleaned_tool_result = ToolResult(
                        content=tool_result["content"], toolUseId=tool_result["toolUseId"], status=tool_result["status"]
                    )

                    cleaned_block: ContentBlock = {"toolResult": cleaned_tool_result}
                    cleaned_content.append(cleaned_block)
                else:
                    # Keep other content blocks as-is
                    cleaned_content.append(content_block)

            # Create new message with cleaned content
            cleaned_message: Message = Message(content=cleaned_content, role=message["role"])
            cleaned_messages.append(cleaned_message)

        return cleaned_messages

    def _has_blocked_guardrail(self, guardrail_data: dict[str, Any]) -> bool:
        """Check if guardrail data contains any blocked policies.

@@ -584,7 +631,7 @@ async def structured_output(
        stop_reason, messages, _, _ = event["stop"]

        if stop_reason != "tool_use":
-            raise ValueError("No valid tool use or tool use input was found in the Bedrock response.")
+            raise ValueError(f'Model returned stop_reason: {stop_reason} instead of "tool_use".')

        content = messages["content"]
        output_response: dict[str, Any] | None = None
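A sketch of what the new cleaning step accomplishes (illustrative only: the `extra` field stands in for anything a hook might attach, and the no-argument BedrockModel construction is assumed to use defaults):

```python
from strands.models.bedrock import BedrockModel

model = BedrockModel()  # assumed default model_id/region configuration

messages = [
    {
        "role": "user",
        "content": [
            {
                "toolResult": {
                    "toolUseId": "123",
                    "status": "success",
                    "content": [{"text": "4"}],
                    # Hypothetical extra field retained for hooks; Bedrock's
                    # ToolResultBlock schema would reject it.
                    "extra": {"latency_ms": 42},
                }
            }
        ],
    }
]

cleaned = model._format_bedrock_messages(messages)
tool_result = cleaned[0]["content"][0]["toolResult"]

# Only the three fields Bedrock's ToolResultBlock accepts remain.
assert set(tool_result) == {"content", "toolUseId", "status"}
```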