Don't trust large context windows

The article challenges the common assumption that a large language model (LLM) can make full use of its advertised context window, arguing that the usable capacity is significantly smaller. It introduces the concept of a 'smart zone' where the model performs well and a 'dumb zone' where attention degrades, typically beyond 100k tokens.

LLM vendors heavily market ever-increasing context window sizes (200k, 1M, 2M tokens), but research indicates a significant drop-off in performance as the window fills.
Coding agents, due to their verbose nature with file reads, debug sessions, and test runs, quickly push LLMs into this less effective 'dumb zone.'
Existing mitigation strategies like auto-compaction (e.g., Claude Code) are imperfect: the summary is often produced only after degradation has set in, and by the same already-degraded model.
The author advocates for a 'breadcrumb' approach: explicitly writing and passing specifications or summaries between LLM sessions, treating the context window as a budget.
This manual handoff ensures higher signal quality than automated summaries and keeps the active session within the 'smart zone.'
Projects like obra/superpowers and mattpocock/skills are cited as examples of structuring agent workflows around small, named artifacts to manage context effectively.
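
The budget-and-breadcrumb idea in the points above can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the names ContextBudget, estimate_tokens, and breadcrumb are hypothetical, the 4-characters-per-token estimate is a crude assumption rather than a real tokenizer, and the 100k limit is simply the figure quoted in the summary.

```python
from dataclasses import dataclass, field

# Assumption: the ~100k-token "smart zone" figure quoted in the summary.
SMART_ZONE_LIMIT = 100_000


def estimate_tokens(text: str) -> int:
    """Crude heuristic (assumed): roughly 4 characters per token."""
    return max(1, len(text) // 4)


@dataclass
class ContextBudget:
    """Hypothetical helper that treats the context window as a budget."""
    limit: int = SMART_ZONE_LIMIT
    handoff_fraction: float = 0.8  # hand off before the limit, not after it
    used: int = 0
    notes: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        """Charge anything that enters the session (file reads, logs, test output)."""
        self.used += estimate_tokens(text)

    def note(self, finding: str) -> None:
        """Record a decision or finding worth carrying into the next session."""
        self.notes.append(finding)

    def should_hand_off(self) -> bool:
        """True once the session has spent its share of the budget."""
        return self.used >= self.limit * self.handoff_fraction

    def breadcrumb(self, goal: str) -> str:
        """Build the short, explicit handoff text for a fresh session."""
        lines = [f"Goal: {goal}", "Decisions and findings:"]
        lines += [f"- {n}" for n in self.notes]
        return "\n".join(lines)


if __name__ == "__main__":
    budget = ContextBudget(limit=1000)
    budget.add("x" * 2000)  # e.g. a large file read
    budget.note("Bug is in parse_config")
    budget.add("y" * 1600)  # e.g. a verbose test run
    if budget.should_hand_off():
        print(budget.breadcrumb("Fix config loader"))
```

The handoff threshold sits below the limit on purpose: the breadcrumb is written while the session is still in the 'smart zone', rather than after degradation has started.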

Ultimately, the piece advises developers to treat the LLM context window as a finite resource, actively managing information flow through external artifacts to maintain model performance and avoid the pitfalls of an overstuffed, underperforming context.
