HN Today

Anthropic silently downgraded cache TTL from 1h to 5m on March 6th

A developer presents a detailed data analysis alleging that Anthropic silently downgraded the default prompt cache Time-To-Live (TTL) for its Claude Code API from 1 hour to 5 minutes. This unannounced server-side change inflated cache costs by 20-32% and caused subscription users to hit their quotas more often. The story resonates on HN as a critical examination of transparency and billing practices in the rapidly evolving AI API landscape.

Score: 8
Comments: 1
Highest Rank: #2
Time on Front Page: 9h
First Seen: Apr 12, 8:00 AM
Last Seen: Apr 12, 7:00 PM
Rank Over Time: (hourly rank chart not reproducible in text)

The Lowdown

An in-depth analysis of Claude Code session logs reveals that Anthropic seemingly made a covert, server-side change to its API's default prompt cache TTL. This adjustment, which occurred around March 6-8, 2026, reduced the TTL from a consistent 1 hour to just 5 minutes, leading to substantial financial and quota implications for users.
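An analysis like this can be reproduced from session logs without any client-side changes. The sketch below shows one way to bound the effective TTL: find the longest gap between calls that still produced a cache read. The usage field name `cache_read_input_tokens` matches Anthropic's Messages API usage object, but the log record layout and the session data are hypothetical.

```python
# Infer a lower bound on the effective cache TTL from per-session API
# call records. Record layout here is an assumption: each record has a
# timestamp `ts` and the usage counter `cache_read_input_tokens`.
from datetime import datetime, timedelta

def infer_ttl_lower_bound(calls):
    """Given chronologically ordered call records for one session,
    return the largest inter-call gap that still produced a cache
    *read* -- the effective TTL must be at least this long."""
    longest_hit_gap = timedelta(0)
    for prev, curr in zip(calls, calls[1:]):
        gap = curr["ts"] - prev["ts"]
        if curr["cache_read_input_tokens"] > 0 and gap > longest_hit_gap:
            longest_hit_gap = gap
    return longest_hit_gap

# Hypothetical session: a 12-minute gap still hits the cache, which
# rules out a 5-minute TTL for that period.
t0 = datetime(2026, 2, 1, 9, 0)
calls = [
    {"ts": t0,                         "cache_read_input_tokens": 0},
    {"ts": t0 + timedelta(minutes=12), "cache_read_input_tokens": 180_000},
    {"ts": t0 + timedelta(minutes=14), "cache_read_input_tokens": 180_000},
]
print(infer_ttl_lower_bound(calls))
```

Applied across many sessions, a sudden drop in this bound from tens of minutes to under five would mark the date of a server-side TTL change.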

  • The analysis, based on 119,866 API calls across two independent machines and accounts, demonstrates a clear shift in cache behavior.
  • From February 1 to March 5, 2026, the API consistently used a 1-hour TTL, indicating it was likely the intended default.
  • Starting March 6, 5-minute TTL calls began to reappear, quickly becoming dominant by March 8, despite no client-side changes.
  • This reversion resulted in a 20-32% increase in cache creation costs for users, as frequent cache expirations forced re-uploads at higher 'write' rates instead of cheaper 'read' rates.
  • Pro/subscription users also reported hitting their quota limits for the first time, directly correlating with the TTL change.
  • The author hypothesizes that the 1-hour TTL was the deliberate default, and the subsequent downgrade was either an intentional cost-saving measure or an accidental regression.
  • The post requests Anthropic to confirm the change, clarify the intended TTL, consider restoring the 1-hour default, and disclose quota counting for cache reads.
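The write-vs-read cost mechanics behind the 20-32% figure can be sketched with a toy model. All rates and session parameters below are illustrative assumptions, not Anthropic's actual pricing; a short TTL in a slow-paced session is a worst case, so this toy gap is larger than the averaged increase the post reports.

```python
# Back-of-the-envelope cost model for prompt caching under two TTLs.
# Rates and session shape are illustrative assumptions only.

def session_cache_cost(ttl_min, gap_min, turns, cached_tokens,
                       write_rate, read_rate):
    """Cost (in $) of keeping `cached_tokens` warm across `turns`
    requests spaced `gap_min` minutes apart. When the gap exceeds the
    TTL the cache has expired and the prefix is re-written at the
    write rate; otherwise it is read back at the cheaper read rate."""
    cost = cached_tokens / 1e6 * write_rate  # initial cache write
    for _ in range(turns - 1):
        if gap_min > ttl_min:                # cache expired -> re-write
            cost += cached_tokens / 1e6 * write_rate
        else:                                # cache hit -> cheap read
            cost += cached_tokens / 1e6 * read_rate
    return cost

# Hypothetical session: 200k-token cached prefix, 20 turns, 8-minute
# gaps, $3.75/MTok writes vs $0.30/MTok reads (made-up rates).
long_ttl  = session_cache_cost(60, 8, 20, 200_000, 3.75, 0.30)
short_ttl = session_cache_cost(5,  8, 20, 200_000, 3.75, 0.30)
print(f"1h TTL: ${long_ttl:.2f}  5m TTL: ${short_ttl:.2f}")
```

With 8-minute gaps, every turn under a 5-minute TTL misses the cache and pays the write rate, while a 1-hour TTL pays it once; the same mechanism, averaged over real traffic with varied gaps, yields the more modest 20-32% increase described above.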

This meticulous investigation highlights how a seemingly minor, unannounced technical alteration can have significant financial consequences for developers and raises important questions about transparency and fair billing practices from major AI service providers.