Anthropic silently downgraded cache TTL from 1h → 5M on March 6th
A developer presents a detailed data analysis alleging that Anthropic silently downgraded the default prompt cache Time-To-Live (TTL) used by Claude Code from 1 hour to 5 minutes. According to the analysis, this unannounced server-side change raised cache creation costs by 20-32% and caused subscription users to hit their quotas more often. The story resonates on HN as a critical examination of transparency and billing practices in the rapidly evolving AI API landscape.
The Lowdown
An in-depth analysis of Claude Code session logs suggests that Anthropic made an unannounced, server-side change to the default prompt cache TTL. The change, which appears in the logs around March 6-8, 2026, reduced the TTL from a consistent 1 hour to just 5 minutes, raising costs for API users and quota consumption for subscribers.
- The analysis, based on 119,866 API calls across two independent machines and accounts, demonstrates a clear shift in cache behavior.
- From February 1 to March 5, 2026, the API consistently used a 1-hour TTL, indicating it was likely the intended default.
- Starting March 6, 5-minute TTL calls began to reappear, quickly becoming dominant by March 8, despite no client-side changes.
- This reversion resulted in a 20-32% increase in cache creation costs for users, as frequent cache expirations forced re-uploads at higher 'write' rates instead of cheaper 'read' rates.
- Pro/subscription users also reported hitting their quota limits for the first time, in a period that coincides with the TTL change.
- The author hypothesizes that the 1-hour TTL was the deliberate default, and the subsequent downgrade was either an intentional cost-saving measure or an accidental regression.
- The post asks Anthropic to confirm the change, clarify the intended TTL, consider restoring the 1-hour default, and disclose how cache reads count toward quotas.
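The kind of log tally described in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the author's script: it assumes each logged call is a dict with an ISO 8601 `timestamp` and a `cache_creation` breakdown holding `ephemeral_1h_input_tokens` and `ephemeral_5m_input_tokens`, and the two sample records are invented for the example.

```python
from collections import defaultdict


def ttl_share_by_day(records):
    """Return {day: fraction of cache-write tokens that used the 1-hour TTL}."""
    totals = defaultdict(lambda: [0, 0])  # day -> [1h tokens, 5m tokens]
    for rec in records:
        day = rec["timestamp"][:10]  # assumes ISO 8601, e.g. "2026-03-05T10:00:00Z"
        creation = rec.get("cache_creation", {})  # assumed field layout
        totals[day][0] += creation.get("ephemeral_1h_input_tokens", 0)
        totals[day][1] += creation.get("ephemeral_5m_input_tokens", 0)
    # Skip days with no cache writes at all to avoid dividing by zero.
    return {
        day: one_hour / (one_hour + five_min)
        for day, (one_hour, five_min) in sorted(totals.items())
        if one_hour + five_min > 0
    }


# Invented sample records, for illustration only.
sample = [
    {"timestamp": "2026-03-05T10:00:00Z",
     "cache_creation": {"ephemeral_1h_input_tokens": 1000}},
    {"timestamp": "2026-03-08T10:00:00Z",
     "cache_creation": {"ephemeral_1h_input_tokens": 100,
                        "ephemeral_5m_input_tokens": 900}},
]
print(ttl_share_by_day(sample))
```

A daily 1-hour share that drops from near 1.0 toward 0 with no client-side change is the pattern the post reports between March 5 and March 8.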
This meticulous investigation highlights how a seemingly minor, unannounced technical alteration can have significant financial consequences for developers and raises important questions about transparency and fair billing practices from major AI service providers.
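To make the cost mechanism concrete, here is a rough sketch of why a shorter TTL matters when a session pauses. The function name and the price multipliers are placeholder assumptions chosen for illustration, not figures from the post or from any price list. The only point is that an expired cache is billed at the write rate rather than the cheaper read rate.

```python
def resume_cost(prefix_tokens, idle_minutes, ttl_minutes,
                base_price=1.0, write_mult=1.25, read_mult=0.1):
    """Cost of re-sending a cached prefix after an idle gap.

    All prices are hypothetical units per token. If the gap is shorter than
    the TTL, the prefix is still cached and billed as a read; otherwise it
    has expired and must be written to the cache again.
    """
    multiplier = read_mult if idle_minutes < ttl_minutes else write_mult
    return prefix_tokens * base_price * multiplier


# A 100,000-token context resumed after a 10-minute pause.
with_1h_ttl = resume_cost(100_000, idle_minutes=10, ttl_minutes=60)
with_5m_ttl = resume_cost(100_000, idle_minutes=10, ttl_minutes=5)
print(with_1h_ttl, with_5m_ttl)
```

Under these placeholder rates, any pause longer than 5 minutes but shorter than an hour turns a cheap read into a full write, which is the effect the analysis attributes the 20-32% increase to.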