For detailed information about supported models, minimum token requirements, and other constraints, see the Amazon Bedrock documentation on prompt caching.

#### System Prompt Caching

System prompt caching allows you to reuse a cached system prompt across multiple requests. Strands supports two approaches for system prompt caching:

**Provider-Agnostic Approach (Recommended)**

Use `SystemContentBlock` arrays to define cache points that work across all model providers:

```python
from strands import Agent
from strands.types.content import SystemContentBlock

# Define system content with a cache point after the instructions
system_content = [
    SystemContentBlock(
        text=(
            "You are a helpful assistant that provides concise answers. "
            "This is a long system prompt with detailed instructions..."
        ) * 1600  # repeated so the prompt exceeds the minimum cacheable size (at least 1,024 tokens)
    ),
    SystemContentBlock(cachePoint={"type": "default"})
]

# Create an agent with the SystemContentBlock array
agent = Agent(system_prompt=system_content)

# First request will cache the system prompt
response1 = agent("Tell me about Python")
print(f"Cache write tokens: {response1.metrics.accumulated_usage.get('cacheWriteInputTokens')}")
print(f"Cache read tokens: {response1.metrics.accumulated_usage.get('cacheReadInputTokens')}")

# Second request will reuse the cached system prompt
response2 = agent("Tell me about JavaScript")
print(f"Cache write tokens: {response2.metrics.accumulated_usage.get('cacheWriteInputTokens')}")
print(f"Cache read tokens: {response2.metrics.accumulated_usage.get('cacheReadInputTokens')}")
```

**Legacy Bedrock-Specific Approach**

For backwards compatibility, you can still use the Bedrock-specific `cache_prompt` configuration:

```python
from strands import Agent
from strands.models import BedrockModel

# Using legacy system prompt caching with BedrockModel
bedrock_model = BedrockModel(
    model_id="anthropic.claude-sonnet-4-20250514-v1:0",
    cache_prompt="default"  # This approach is deprecated
)

# Create an agent with the model
agent = Agent(
    model=bedrock_model,
    system_prompt="You are a helpful assistant that provides concise answers. " +
                  "This is a long system prompt with detailed instructions... "
                  # Add enough text to reach the minimum token requirement for your model
)

response = agent("Tell me about Python")
```

> **Note**: The `cache_prompt` configuration is deprecated in favor of the provider-agnostic `SystemContentBlock` approach. The new approach enables caching across all model providers through a unified interface.

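Because the cache point is declared in the `SystemContentBlock` array itself rather than in provider-specific configuration, the same content can be passed to other model providers. The snippet below is a minimal sketch, not from the original docs: it assumes the Anthropic provider is installed and configured, and the model ID, `client_args`, and padding string are illustrative placeholders.

```python
from strands import Agent
from strands.models.anthropic import AnthropicModel
from strands.types.content import SystemContentBlock

# Illustrative only: reuse the provider-agnostic cache point with a non-Bedrock provider.
# The model ID and client_args below are placeholders for your own configuration.
anthropic_model = AnthropicModel(
    client_args={"api_key": "<YOUR_API_KEY>"},
    model_id="claude-sonnet-4-20250514",
    max_tokens=1024,
)

system_content = [
    SystemContentBlock(text="Detailed instructions... " * 1600),  # long enough to be cacheable
    SystemContentBlock(cachePoint={"type": "default"}),
]

agent = Agent(model=anthropic_model, system_prompt=system_content)
response = agent("Tell me about Python")
```

Cache behavior can be checked the same way as in the examples above, by inspecting `response.metrics.accumulated_usage`.
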
#### Tool Caching

Tool caching allows you to reuse a cached tool definition across multiple requests: