Amazon Bedrock now Supports One-Hour Prompt Caching Toward Cost Efficiency
Amazon Web Services has added one-hour prompt caching to Amazon Bedrock for selected Anthropic Claude models, allowing developers to reuse prompts across multiple model invocations without resending the same input each time. The update improves the original five-minute default to an hour, ideal for longer-running agentic workflows and multi-turn conversations. This maintains context for users who may interact less frequently or need time in between longer sessions.
Prompt caching addresses a growing challenge as enterprises scale generative AI beyond simple, single-turn requests. By enabling short-duration caching, AWS provides a mechanism to improve responsiveness while maintaining flexibility, without requiring developers to manage long-lived state externally. Competing platforms have introduced similar techniques to optimize conversational and agentic AI workloads. For engineering teams building real-time or user-facing AI services on Bedrock, one-hour prompt caching offers a practical way to balance performance, cost efficiency, and architectural simplicity as usage scales. The one-hour time-to-live prompt caching update is generally available on Claude Sonnet 4.5, Haiku 4.5 and Opus 4.5 in AWS regions and AWS GovCloud (US) regions.
The "AWS Release Radar" blog is researched, fact-checked, edited and updated by the editors of AWSInsider.net, with writing assistance from AI. To submit your channel company's press release for consideration, contact Ammaarah Mohamed.
Posted by AWS Editors on 01/27/2026