HN
Today

What Claude Code Chooses

A new study finds that, given open-ended prompts, Anthropic's Claude models (run through Claude Code) prefer building custom solutions over adopting existing tools, yet pick decisively when they do choose a tool, favoring popular options like GitHub Actions. This look at AI's practical decision-making highlights its potential influence on future tech stacks and has sparked discussion of LLM bias and prompting strategies. The findings raise intriguing questions about how AI will shape the "build vs. buy" dilemma in software development.

Score: 34 · Comments: 16 · Highest Rank: #5 · Time on Front Page: 4h
First Seen: Feb 26, 7:00 PM · Last Seen: Feb 26, 10:00 PM

The Lowdown

A research study by Amplifying.AI investigated "What Claude Code Actually Chooses" by pointing three Claude models (Sonnet 4.5, Opus 4.5, Opus 4.6) at real code repositories 2,430 times, asking open-ended questions without naming specific tools. The goal was to observe the AI's natural inclinations in software development decisions across 20 tool categories and 4 project types. A clear tool choice could be extracted from 85.3% of the runs.

  • Build, Don't Buy: The most significant finding was Claude Code's strong preference for custom/DIY solutions. In 12 out of 20 categories, it opted to build custom code (e.g., config systems for feature flags, JWT + bcrypt for Python auth, in-memory TTL wrappers for caching) rather than recommending established tools.
  • Decisive Tool Picks: When Claude Code does choose a tool, it picks with high conviction and consistency, favoring widely adopted options such as GitHub Actions (94%), Stripe (91%), and shadcn/ui (90%).
  • Default Stack Formation: The models consistently recommend certain tools, particularly within the JS ecosystem, establishing a "default stack" that includes Zustand for state management and Sentry for observability.
  • Generational Shifts and Recency Gradient: Newer Claude models show a clear preference for newer tools (e.g., Drizzle over Prisma for JS ORM) and sometimes move away from established solutions like Celery or even Redis in specific contexts.
  • Deployment Split: Deployment choices were starkly divided by stack: Vercel dominated for JS frontend deployments, while Railway was preferred for Python backends, with traditional cloud providers like AWS, GCP, and Azure receiving zero primary picks.
  • Model Disagreement: While models generally agreed across 18 categories, 5 categories (JS ORM, JS Jobs, Python Jobs, Caching, Real-time) showed notable shifts or cross-language disagreement between Sonnet 4.5, Opus 4.5, and Opus 4.6.
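The "in-memory TTL wrapper" that the study describes Claude building in place of Redis can be pictured as a small sketch like the one below. This is our own illustrative reconstruction, not code from the study; the class and method names are assumptions.

```python
import time


class TTLCache:
    """Minimal in-memory cache with per-entry time-to-live.

    Illustrative sketch of the kind of custom caching wrapper the
    study describes Claude building instead of recommending Redis;
    names and API are hypothetical.
    """

    def __init__(self, default_ttl=60.0):
        self.default_ttl = default_ttl
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl=None):
        # Store the value alongside its absolute expiry time.
        ttl = self.default_ttl if ttl is None else ttl
        self._store[key] = (time.monotonic() + ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Lazily evict the expired entry on access.
            del self._store[key]
            return default
        return value
```

A bespoke wrapper like this trades Redis's persistence and cross-process sharing for zero dependencies, which is consistent with the "build, don't buy" pattern the study observed.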

Overall, the research demonstrates Claude Code's tendency to create bespoke solutions when given the latitude, potentially influencing the adoption curve of tools by either sidelining mature options or solidifying the position of current leaders. The study offers a glimpse into the evolving decision-making processes of advanced AI coding assistants.

The Gossip

Algorithmic Allegiance: LLM Bias & Influence

Commenters extensively discussed the potential for LLMs to exhibit bias in their recommendations, whether intentional or not. The idea of "invisible advertising" or a "conflict of interest" was raised, with some speculating that models might subtly favor certain tech stacks or providers (e.g., Google's Gemini favoring GCP). Others humorously noted how different tiers of Claude might recommend different tools, highlighting a perceived commercial angle.

Prompting Prowess: Guiding AI Choices

Many developers shared their personal strategies for interacting with LLMs, contrasting with the study's "no tools named" approach. They emphasized the importance of explicitly specifying desired libraries, frameworks, or architectural patterns in prompts to guide the AI's output. Some argued that experienced developers *should* be capable of directing the model, using separate context windows to inquire about architectural choices before making a decision.

Specific Software Surprises: Tooling Reactions

The community reacted to several specific tool choices (or omissions) made by Claude. The shift from Redis to custom caching solutions, particularly in newer models, sparked curiosity about the underlying reasons. Claude's preference for building custom feature flag systems over recommending SaaS solutions like LaunchDarkly was welcomed by commenters who dislike feature flag services. There was also surprise at React's absence from some summary mentions, at the dominance of deployment platforms like Vercel and Railway, and speculation that LLM defaults might "keep React alive."