For detailed information about supported models, minimum token requirements, and other constraints, see the Amazon Bedrock documentation on prompt caching.

#### System Prompt Caching

System prompt caching allows you to reuse a cached system prompt across multiple requests. Strands supports two approaches for system prompt caching:

**Provider-Agnostic Approach (Recommended)**

Use `SystemContentBlock` arrays to define cache points that work across all model providers:

```python
from strands import Agent
from strands.types.content import SystemContentBlock

# Define system content with a cache point after the instructions
system_content = [
    SystemContentBlock(
        text=(
            "You are a helpful assistant that provides concise answers. "
            "This is a long system prompt with detailed instructions..."
        ) * 1600  # repeated so the prompt exceeds the minimum cacheable size (at least 1,024 tokens)
    ),
    SystemContentBlock(cachePoint={"type": "default"})
]

# Create an agent with the SystemContentBlock array
agent = Agent(system_prompt=system_content)

# First request will cache the system prompt
response1 = agent("Tell me about Python")
print(f"Cache write tokens: {response1.metrics.accumulated_usage.get('cacheWriteInputTokens')}")
print(f"Cache read tokens: {response1.metrics.accumulated_usage.get('cacheReadInputTokens')}")

# Second request will reuse the cached system prompt
response2 = agent("Tell me about JavaScript")
print(f"Cache write tokens: {response2.metrics.accumulated_usage.get('cacheWriteInputTokens')}")
print(f"Cache read tokens: {response2.metrics.accumulated_usage.get('cacheReadInputTokens')}")
```

**Legacy Bedrock-Specific Approach**

For backwards compatibility, you can still use the Bedrock-specific `cache_prompt` configuration:

```python
from strands import Agent
from strands.models import BedrockModel

# Using legacy system prompt caching with BedrockModel
bedrock_model = BedrockModel(
    model_id="anthropic.claude-sonnet-4-20250514-v1:0",
    cache_prompt="default"  # This approach is deprecated
)

# Create an agent with the model
agent = Agent(
    model=bedrock_model,
    system_prompt="You are a helpful assistant that provides concise answers. " +
                  "This is a long system prompt with detailed instructions... "
                  # Add enough text to reach the minimum token requirement for your model
)

response = agent("Tell me about Python")
```

> **Note**: The `cache_prompt` configuration is deprecated in favor of the provider-agnostic `SystemContentBlock` approach. The new approach enables caching across all model providers through a unified interface.

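Because the cache point is declared in the `SystemContentBlock` array itself rather than in provider-specific configuration, the same content can be passed to other model providers. The snippet below is a minimal sketch, not from the original docs: it assumes the Anthropic provider is installed and configured, and the model ID, `client_args`, and padding string are illustrative placeholders.

```python
from strands import Agent
from strands.models.anthropic import AnthropicModel
from strands.types.content import SystemContentBlock

# Illustrative only: reuse the provider-agnostic cache point with a non-Bedrock provider.
# The model ID and client_args below are placeholders for your own configuration.
anthropic_model = AnthropicModel(
    client_args={"api_key": "<YOUR_API_KEY>"},
    model_id="claude-sonnet-4-20250514",
    max_tokens=1024,
)

system_content = [
    SystemContentBlock(text="Detailed instructions... " * 1600),  # long enough to be cacheable
    SystemContentBlock(cachePoint={"type": "default"}),
]

agent = Agent(model=anthropic_model, system_prompt=system_content)
response = agent("Tell me about Python")
```

Cache behavior can be checked the same way as in the examples above, by inspecting `response.metrics.accumulated_usage`.
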
#### Tool Caching

Tool caching allows you to reuse a cached tool definition across multiple requests: