Commit c60f5b2

feat: Add docs for Sagemaker model provider (#186)
* feat: Add docs for Sagemaker model provider
* Updates based on PR comments
1 parent 1fa9680 commit c60f5b2

File tree: 2 files changed (+105, -0 lines)

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@

# Amazon SageMaker

[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed machine learning service that provides infrastructure and tools for building, training, and deploying ML models at scale. The Strands Agents SDK implements a SageMaker provider, allowing you to run agents against models deployed on SageMaker inference endpoints, including both pre-trained models from SageMaker JumpStart and custom fine-tuned models. The provider is designed to work with models that support OpenAI-compatible chat completion APIs.

For example, you can expose models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) on SageMaker, which has demonstrated reliable performance for conversational AI and tool calling scenarios.

## Installation

SageMaker is configured as an optional dependency in Strands Agents. To install, run:

```bash
pip install 'strands-agents[sagemaker]'
```
## Usage

After installing the SageMaker dependencies, you can import and initialize the Strands Agents' SageMaker provider as follows:

```python
from strands import Agent
from strands.models.sagemaker import SageMakerAIModel
from strands_tools import calculator

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",
        "region_name": "us-west-2",
    },
    payload_config={
        "max_tokens": 1000,
        "temperature": 0.7,
        "stream": True,
    }
)

agent = Agent(model=model, tools=[calculator])
response = agent("What is the square root of 64?")
```

**Note**: Tool calling support varies by model. Models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) have demonstrated reliable tool calling capabilities, but not all models deployed on SageMaker support this feature. Verify your model's capabilities before implementing tool-based workflows.
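
If your deployed model does not support tool calling, the provider can still be used for plain conversation by simply not registering any tools. A minimal sketch, reusing the placeholder endpoint name from the example above:

```python
from strands import Agent
from strands.models.sagemaker import SageMakerAIModel

# No tools are registered, so the agent never asks the model to emit tool calls.
model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",  # placeholder endpoint name
        "region_name": "us-west-2",
    },
    payload_config={"max_tokens": 1000},
)

agent = Agent(model=model)
response = agent("Explain the difference between real-time and serverless inference.")
```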

## Configuration

### Endpoint Configuration

The `endpoint_config` configures the SageMaker endpoint connection:

| Parameter | Description | Required | Example |
|-----------|-------------|----------|---------|
| `endpoint_name` | Name of the SageMaker endpoint | Yes | `"my-llm-endpoint"` |
| `region_name` | AWS region where the endpoint is deployed | Yes | `"us-west-2"` |
| `inference_component_name` | Name of the inference component | No | `"my-component"` |
| `target_model` | Specific model to invoke (multi-model endpoints) | No | `"model-a.tar.gz"` |
| `target_variant` | Production variant to invoke | No | `"variant-1"` |
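
For endpoints that use inference components, multi-model hosting, or production variants, the optional fields are added alongside the two required ones. A hedged sketch; the endpoint, component, model, and variant names are placeholders, and you would normally set only the optional fields that match your endpoint's hosting mode:

```python
from strands.models.sagemaker import SageMakerAIModel

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",          # required
        "region_name": "us-west-2",                  # required
        "inference_component_name": "my-component",  # optional: route to an inference component
        # "target_model": "model-a.tar.gz",          # optional: pick a model on a multi-model endpoint
        # "target_variant": "variant-1",             # optional: pin a production variant
    },
    payload_config={"max_tokens": 1000},
)
```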

### Payload Configuration

The `payload_config` configures the model inference parameters:

| Parameter | Description | Default | Example |
|-----------|-------------|---------|---------|
| `max_tokens` | Maximum number of tokens to generate | Required | `1000` |
| `stream` | Enable streaming responses | `True` | `True` |
| `temperature` | Sampling temperature (0.0 to 2.0) | Optional | `0.7` |
| `top_p` | Nucleus sampling parameter (0.0 to 1.0) | Optional | `0.9` |
| `top_k` | Top-k sampling parameter | Optional | `50` |
| `stop` | List of stop sequences | Optional | `["Human:", "AI:"]` |
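
Putting the inference parameters together, a `payload_config` might look like the following. This is only a sketch using the table's example values, not tuned recommendations:

```python
from strands.models.sagemaker import SageMakerAIModel

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",  # placeholder endpoint name
        "region_name": "us-west-2",
    },
    payload_config={
        "max_tokens": 1000,         # required: cap on generated tokens
        "stream": True,             # defaults to True
        "temperature": 0.7,         # sampling temperature (0.0 to 2.0)
        "top_p": 0.9,               # nucleus sampling (0.0 to 1.0)
        "top_k": 50,                # top-k sampling
        "stop": ["Human:", "AI:"],  # stop sequences
    },
)
```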

## Model Compatibility

The SageMaker provider is designed to work with models that support OpenAI-compatible chat completion APIs. During development and testing, the provider has been validated with [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/), which demonstrated reliable performance across various conversational AI tasks.

### Important Considerations

- **Model Performance**: Results and capabilities vary significantly depending on the specific model deployed to your SageMaker endpoint
- **Tool Calling Support**: Not all models deployed on SageMaker support function/tool calling. Verify your model's capabilities before implementing tool-based workflows
- **API Compatibility**: Ensure your deployed model accepts and returns data in the OpenAI chat completion format

For optimal results, we recommend testing your specific model deployment with your use case requirements before production deployment.
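
One way to sanity-check API compatibility before wiring the endpoint into an agent is to invoke it directly with an OpenAI-style chat payload and inspect the response shape. A minimal sketch using boto3; the endpoint name is a placeholder, and the exact request and response fields depend on your model container:

```python
import json
import boto3

# Send a minimal OpenAI-style chat completion request straight to the endpoint.
runtime = boto3.client("sagemaker-runtime", region_name="us-west-2")
payload = {
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

response = runtime.invoke_endpoint(
    EndpointName="my-llm-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Accept="application/json",
    Body=json.dumps(payload),
)

body = json.loads(response["Body"].read())
# An OpenAI-compatible container typically returns a "choices" list containing a "message".
print(body.get("choices", [{}])[0].get("message"))
```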

## Troubleshooting

### Module Not Found

If you encounter `ModuleNotFoundError: No module named 'boto3'` or similar, install the SageMaker dependencies:

```bash
pip install 'strands-agents[sagemaker]'
```

### Authentication

The SageMaker provider uses standard AWS authentication methods (credentials file, environment variables, IAM roles, or AWS SSO). Ensure your AWS credentials have the necessary SageMaker invoke permissions.
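
If invocation fails with a credential or authorization error, it can help to confirm which identity boto3 resolves and whether it holds the invoke permissions. A small diagnostic sketch; the region is an assumption, and the relevant IAM actions in a typical setup are `sagemaker:InvokeEndpoint` plus `sagemaker:InvokeEndpointWithResponseStream` when streaming is enabled:

```python
import boto3

# Resolve credentials through the standard AWS chain (env vars, shared
# credentials file, IAM role, or SSO) and report which identity will be used.
session = boto3.Session(region_name="us-west-2")  # region is an assumption

if session.get_credentials() is None:
    print("No AWS credentials found; configure the AWS CLI or set environment variables.")
else:
    identity = session.client("sts").get_caller_identity()
    print(f"Calls will be made as: {identity['Arn']}")
    # This identity needs sagemaker:InvokeEndpoint on the endpoint, and
    # sagemaker:InvokeEndpointWithResponseStream when streaming is enabled.
```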

### Model Compatibility

Ensure your deployed model supports OpenAI-compatible chat completion APIs and verify tool calling capabilities if needed. Refer to the [Model Compatibility](#model-compatibility) section above for detailed requirements and testing recommendations.

## References

- [API Reference](../../../api-reference/models.md)
- [Amazon SageMaker Documentation](https://docs.aws.amazon.com/sagemaker/)
- [SageMaker Runtime API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html)

mkdocs.yml

Lines changed: 1 addition & 0 deletions

@@ -94,6 +94,7 @@ nav:
   - MistralAI: user-guide/concepts/model-providers/mistral.md
   - Ollama: user-guide/concepts/model-providers/ollama.md
   - OpenAI: user-guide/concepts/model-providers/openai.md
+  - SageMaker: user-guide/concepts/model-providers/sagemaker.md
   - Writer: user-guide/concepts/model-providers/writer.md
   - Cohere: user-guide/concepts/model-providers/cohere.md
   - Custom Providers: user-guide/concepts/model-providers/custom_model_provider.md
