HN Today

Universal Claude.md – cut Claude output tokens by 63%

A new GitHub project introduces a clever .md file that, when placed in a project, dramatically cuts down on Claude's verbose and 'sycophantic' output, leading to significant token savings for output-heavy LLM interactions. This simple, no-code-change solution has sparked a lively debate on Hacker News about the true cost of tokens, the impact on AI reasoning, and the constant quest for LLM efficiency.

Score: 61 · Comments: 27 · Highest Rank: #1 · Time on Front Page: 18h
First Seen: Mar 31, 2:00 AM · Last Seen: Mar 31, 7:00 PM
Rank Over Time: (chart not reproducible in text)

The Lowdown

The Universal CLAUDE.md project offers a straightforward, drop-in solution to reduce the verbosity and 'fluff' in Claude's AI responses. By simply adding a CLAUDE.md file to a project's root, users can achieve an average 63% reduction in output tokens, tackling issues like overly polite greetings, redundant explanations, and unhelpful formatting.

  • The Problem: Claude's default behavior often includes unnecessary pleasantries, repetition, unsolicited suggestions, and formatting that wastes tokens without adding value.
  • The Fix: A single CLAUDE.md file, which Claude automatically reads, provides rules to curb these behaviors. No code changes are required.
  • Use Cases: Most beneficial for high-volume automation pipelines, structured tasks, and teams needing consistent, parseable output, where the token savings offset the CLAUDE.md file's own input token consumption.
  • Not For: Short, casual queries, fixing deep AI failure modes (like hallucinations), or exploratory work where detailed debate and pushback are desired.
  • Benchmark Results: An average 63% reduction in word count across various tasks (e.g., explaining concepts, code review), translating to considerable token savings for high-volume users.
  • What It Fixes: Specific rules target sycophantic openers, hollow closings, prompt restatement, non-ASCII characters, 'As an AI...' framing, unnecessary disclaimers, scope creep, and more, enforcing concise, direct responses.
  • Pro Tips: Users can compose multiple CLAUDE.md files (global, project-level, subdirectory-level) for fine-grained control and tailor rules to specific failure modes.
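The rule file itself is plain markdown that Claude reads as instructions. A minimal sketch of what such a file might contain, based on the fixes listed above (hypothetical wording, not the project's actual file):

```markdown
# Response rules

- Answer directly. No greetings, sign-offs, or "Great question!" openers.
- Do not restate the prompt or describe what you are about to do.
- No "As an AI..." framing and no unnecessary disclaimers.
- Stay in scope: do only what was asked; no unsolicited suggestions.
- Use ASCII characters only.
- Prefer short, parseable output over decorative formatting.
```

Per the project's tips, a copy at the repo root applies project-wide, while additional copies in subdirectories can layer stricter rules for specific tasks.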

This project provides a practical approach to making Claude's output more efficient and focused, improving developer experience and potentially reducing costs for specific, high-usage scenarios.

The Gossip

The Token Tug-of-War: Cost vs. Quality

Many users debate whether trimming output tokens, especially 'sycophantic' fluff, genuinely saves money or improves model quality. Some point out that input tokens are the primary cost driver for Claude, rendering output savings minor. Others argue that despite lower volume, output tokens are significantly more expensive, making optimization valuable. A key concern is that reduced verbosity, particularly forcing answers before reasoning, might degrade the model's 'thinking' process, leading to less coherent or accurate results, citing Karpathy's observations on LLM reasoning.
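The cost trade-off commenters are arguing about is simple arithmetic: the rules file adds input tokens on every request, while the verbosity cut removes output tokens, which are priced higher per token. A quick sketch with assumed per-million-token prices and made-up request sizes (check current Anthropic pricing; none of these numbers come from the project):

```python
# Hypothetical prices in dollars per million tokens (assumed, not authoritative).
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

def request_cost(input_tokens: int, output_tokens: float) -> float:
    """Dollar cost of a single request at the assumed prices."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# Baseline request: 2,000 input tokens, 1,500 output tokens (illustrative sizes).
baseline = request_cost(2_000, 1_500)

# With the rules file: ~1,000 extra input tokens for the file itself,
# but 63% fewer output tokens (the project's reported average).
trimmed = request_cost(2_000 + 1_000, 1_500 * (1 - 0.63))

print(f"baseline: ${baseline:.4f}  trimmed: ${trimmed:.4f}")
```

At these assumed numbers the output savings outweigh the extra input cost, but the balance flips for short, input-heavy requests, which is exactly the disagreement in the thread.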

Beyond Output: Holistic Context Management

While `CLAUDE.md` focuses on output, commenters highlight that effective token management is broader. Discussions include alternative tools like `Headroom` (context compression), `RTK` (CLI output compression), and `MemStack` (persistent memory) that target input tokens and context. There's also discussion about the utility of verbose output for maintaining 'coherence' in long-running agentic tasks, suggesting that some 'redundant' information might be beneficial for the model's internal state and even improve outcomes, contrary to the `CLAUDE.md` philosophy.

The AI's Peculiar Persona: Taming the Tendencies

The `CLAUDE.md` project directly addresses common frustrations with Claude's default behaviors: sycophantic greetings, unnecessary disclaimers, restating questions, and overly verbose code. Commenters acknowledge these issues, some expressing surprise that LLMs can be so effectively guided by simple `.md` files. There's a shared sentiment that taming these 'annoying responses' improves the user experience and workflow, even if the token savings aren't monumental, transforming Claude from a chatty assistant to a more focused tool.