# Amazon SageMaker

[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed machine learning service that provides infrastructure and tools for building, training, and deploying ML models at scale. The Strands Agents SDK implements a SageMaker provider, allowing you to run agents against models deployed on SageMaker inference endpoints, including both pre-trained models from SageMaker JumpStart and custom fine-tuned models. The provider is designed to work with models that support OpenAI-compatible chat completion APIs.

For example, you can expose models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) on SageMaker, which has demonstrated reliable performance for conversational AI and tool calling scenarios.

## Installation

SageMaker is configured as an optional dependency in Strands Agents. To install, run:

```bash
pip install 'strands-agents[sagemaker]'
```

## Usage

After installing the SageMaker dependencies, you can import and initialize the Strands Agents SageMaker provider as follows:

```python
from strands import Agent
from strands.models.sagemaker import SageMakerAIModel
from strands_tools import calculator

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",
        "region_name": "us-west-2",
    },
    payload_config={
        "max_tokens": 1000,
        "temperature": 0.7,
        "stream": True,
    }
)

agent = Agent(model=model, tools=[calculator])
response = agent("What is the square root of 64?")
```

**Note**: Tool calling support varies by model. Models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) have demonstrated reliable tool calling capabilities, but not all models deployed on SageMaker support this feature. Verify your model's capabilities before implementing tool-based workflows.

## Configuration

### Endpoint Configuration

The `endpoint_config` dictionary configures the SageMaker endpoint connection:

| Parameter | Description | Required | Example |
|-----------|-------------|----------|---------|
| `endpoint_name` | Name of the SageMaker endpoint | Yes | `"my-llm-endpoint"` |
| `region_name` | AWS region where the endpoint is deployed | Yes | `"us-west-2"` |
| `inference_component_name` | Name of the inference component to invoke | No | `"my-component"` |
| `target_model` | Specific model to invoke (multi-model endpoints) | No | `"model-a.tar.gz"` |
| `target_variant` | Production variant to invoke | No | `"variant-1"` |

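As a sketch, an `endpoint_config` targeting a specific model on a multi-model endpoint might look like the following. The endpoint, model, and variant names are placeholders, not real deployments:

```python
# Hypothetical values; substitute the names from your own deployment.
endpoint_config = {
    "endpoint_name": "my-llm-endpoint",  # required: endpoint to invoke
    "region_name": "us-west-2",          # required: region where it is deployed
    "target_model": "model-a.tar.gz",    # optional: only for multi-model endpoints
    "target_variant": "variant-1",       # optional: pick one production variant
}
```

Only the first two keys are needed for a single-model endpoint; the optional keys simply narrow which model or variant handles the request.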
### Payload Configuration

The `payload_config` dictionary configures the model inference parameters. `max_tokens` is required; all other parameters are optional:

| Parameter | Description | Default | Example |
|-----------|-------------|---------|---------|
| `max_tokens` | Maximum number of tokens to generate | (required) | `1000` |
| `stream` | Enable streaming responses | `True` | `True` |
| `temperature` | Sampling temperature (0.0 to 2.0) | (none) | `0.7` |
| `top_p` | Nucleus sampling parameter (0.0 to 1.0) | (none) | `0.9` |
| `top_k` | Top-k sampling parameter | (none) | `50` |
| `stop` | List of stop sequences | (none) | `["Human:", "AI:"]` |

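Putting the table together, a fuller `payload_config` might look like the sketch below. The values are illustrative, not tuned recommendations:

```python
payload_config = {
    "max_tokens": 1000,          # required: cap on generated tokens
    "stream": True,              # matches the default; set False for one-shot responses
    "temperature": 0.7,          # optional: 0.0 to 2.0
    "top_p": 0.9,                # optional: 0.0 to 1.0
    "top_k": 50,                 # optional: top-k sampling
    "stop": ["Human:", "AI:"],   # optional: generation halts at these sequences
}
```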
## Model Compatibility

The SageMaker provider is designed to work with models that support OpenAI-compatible chat completion APIs. During development and testing, the provider has been validated with [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/), which demonstrated reliable performance across various conversational AI tasks.

### Important Considerations

- **Model Performance**: Results and capabilities vary significantly depending on the specific model deployed to your SageMaker endpoint
- **Tool Calling Support**: Not all models deployed on SageMaker support function/tool calling. Verify your model's capabilities before implementing tool-based workflows
- **API Compatibility**: Ensure your deployed model accepts and returns data in the OpenAI chat completion format

For optimal results, we recommend testing your specific model deployment with your use case requirements before production deployment.

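As a rough reference for what "OpenAI chat completion format" means here, a request body in that style has the general shape sketched below. The field names follow the OpenAI chat API; the exact subset your container accepts depends on its serving stack:

```python
# Sketch of an OpenAI-compatible chat completion request body.
request_body = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the square root of 64?"},
    ],
    "max_tokens": 1000,
    "temperature": 0.7,
    "stream": True,  # when streamed, responses arrive as incremental chunks
}
```

If your deployed container rejects this shape or returns a different response schema, the provider will not be able to parse its output.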
## Troubleshooting

### Module Not Found

If you encounter `ModuleNotFoundError: No module named 'boto3'` or similar, install the SageMaker dependencies:

```bash
pip install 'strands-agents[sagemaker]'
```

### Authentication

The SageMaker provider uses standard AWS authentication methods (credentials file, environment variables, IAM roles, or AWS SSO). Ensure your AWS credentials have the necessary SageMaker invoke permissions, such as the `sagemaker:InvokeEndpoint` IAM action.

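For instance, one common option is to supply credentials through environment variables before running your agent. The key values below are placeholders, not working credentials:

```shell
# Placeholder credentials: replace with values from your AWS account.
export AWS_ACCESS_KEY_ID="AKIA-placeholder"
export AWS_SECRET_ACCESS_KEY="placeholder-secret"
export AWS_DEFAULT_REGION="us-west-2"
```

IAM roles (for example on EC2 or Lambda) avoid long-lived keys entirely and are generally preferable in production.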
### Model Compatibility

Ensure your deployed model supports OpenAI-compatible chat completion APIs and verify tool calling capabilities if needed. Refer to the [Model Compatibility](#model-compatibility) section above for detailed requirements and testing recommendations.

## References

- [API Reference](../../../api-reference/models.md)
- [Amazon SageMaker Documentation](https://docs.aws.amazon.com/sagemaker/)
- [SageMaker Runtime API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html)