You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Please check [Usage Guide](./docs/Usage.md) for more details about how to use embedding API, multimodal API and tool call.
150
152
153
+
### Application Inference Profiles
151
154
155
+
This proxy now supports **Application Inference Profiles**, which allow you to track usage and costs for your model invocations. You can use application inference profiles created in your AWS account for cost tracking and monitoring purposes.
156
+
157
+
**Using Application Inference Profiles:**
158
+
159
+
```bash
160
+
# Use an application inference profile ARN as the model ID
-**Cost Tracking**: Track usage and costs for specific applications or use cases
191
+
-**Usage Monitoring**: Monitor model invocation metrics through CloudWatch
192
+
-**Tag-based Cost Allocation**: Use AWS cost allocation tags for detailed billing analysis
193
+
194
+
For more information about creating and managing application inference profiles, see the [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-create.html).
**Important Notice**: Please carefully review the following points before using reasoning mode for Chat completion API.
450
+
451
+
Extended thinking with tool use in Claude 4 models supports [interleaved thinking](https://docs.aws.amazon.com/bedrock/latest/userguide/claude-messages-extended-thinking.html#claude-messages-extended-thinking-tool-use-interleaved) enables Claude 4 models to think between tool calls and run more sophisticated reasoning after receiving tool results. which is helpful for more complex agentic interactions.
452
+
With interleaved thinking, the `budget_tokens` can exceed the `max_tokens` parameter because it represents the total budget across all thinking blocks within one assistant turn.
**重要提示**:在使用 Chat Completion API 的推理模式(reasoning mode)前,请务必仔细阅读以下内容。
450
+
451
+
Claude 4 模型支持借助工具使用的扩展思维功能(Extended Thinking),其中包含交错思考([interleaved thinking](https://docs.aws.amazon.com/bedrock/latest/userguide/claude-messages-extended-thinking.html#claude-messages-extended-thinking-tool-use-interleaved) )。该功能使 Claude 4 可以在多次调用工具之间进行思考,并在收到工具结果后执行更复杂的推理,这对处理更复杂的 Agentic AI 交互非常有帮助。
0 commit comments