
Show HN: State of the Art of Coding Models, According to Hacker News Commenters

Curious about which AI coding assistants dominate the Hacker News zeitgeist? The author built an LLM-powered pipeline that scans comments for model mentions and sentiment, producing a daily, community-driven 'State of the Art' report. It's an AI-on-AI analysis that has sparked lively debate about its methodology and about how popular coding models actually perform.

Score: 63
Comments: 32
Highest Rank: #10
Time on Front Page: 12h
First Seen: May 2, 10:00 PM
Last Seen: May 3, 9:00 AM
Rank Over Time (hourly): 10, 11, 12, 10, 11, 15, 18, 18, 21, 16, 14, 16

The Lowdown

Feeling disconnected from the rapidly evolving landscape of AI coding assistants, the author developed an automated system to analyze Hacker News comments for insights into model popularity and sentiment. This 'Show HN' project aims to provide a dynamic, community-sourced perspective on the current state of these tools.

  • Each day, the pipeline processes the 200 most popular HN posts from a 24-hour window.
  • An initial LLM filters these posts to identify those relevant to LLMs or coding.
  • Gemini is then used to identify specific coding models mentioned in comments (from the OpenRouter list) and assess the sentiment towards each.
  • All results are logged to a publicly accessible Google Sheet, allowing users to audit the process and examine individual comment sentiments.
  • The project provides a unique, real-world data point on user experiences, complementing traditional benchmarks.
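
The steps above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the two LLM calls are replaced by simple keyword checks so the example runs offline, and every name in it (`Post`, `KNOWN_MODELS`, `is_relevant`, `classify_comment`, `run_pipeline`) is hypothetical. The returned `rows` list stands in for the public audit sheet, with one row per model mention.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-in for the OpenRouter model list mentioned above.
KNOWN_MODELS = ["claude", "gpt", "gemini", "qwen", "deepseek"]

# Toy sentiment lexicons; the real pipeline asks an LLM instead.
POSITIVE = {"great", "love", "good", "reliable"}
NEGATIVE = {"bad", "unreliable", "broken", "hate"}


@dataclass
class Post:
    title: str
    score: int
    comments: list[str] = field(default_factory=list)


def _words(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_relevant(post: Post) -> bool:
    """Stage 1 stand-in: keep posts whose title looks LLM- or coding-related."""
    return bool(_words(post.title) & {"llm", "llms", "coding", "ai", "model"})


def classify_comment(comment: str) -> dict[str, int]:
    """Stage 2 stand-in: map each mentioned model to a sentiment in {-1, 0, 1}.

    Crude assumption: one comment-level sentiment is applied to every model
    the comment mentions.
    """
    words = _words(comment)
    sentiment = int(bool(words & POSITIVE)) - int(bool(words & NEGATIVE))
    return {model: sentiment for model in KNOWN_MODELS if model in words}


def run_pipeline(posts: list[Post], top_n: int = 200):
    """Rank posts by score, filter them, classify comments, and aggregate."""
    top_posts = sorted(posts, key=lambda p: p.score, reverse=True)[:top_n]
    rows = []  # audit log: (post title, comment, model, sentiment)
    for post in filter(is_relevant, top_posts):
        for comment in post.comments:
            for model, sentiment in classify_comment(comment).items():
                rows.append((post.title, comment, model, sentiment))

    # Group the logged sentiments by model and summarize them.
    per_model = defaultdict(list)
    for _, _, model, sentiment in rows:
        per_model[model].append(sentiment)
    summary = {
        model: {"mentions": len(s), "mean_sentiment": sum(s) / len(s)}
        for model, s in per_model.items()
    }
    return rows, summary
```

Keeping the per-comment rows alongside the aggregate mirrors the auditability described above: a reader who doubts a model's score can trace it back to the individual comments behind it.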

By leveraging AI to understand discussions about AI, the project offers a timely snapshot of user preferences and perceived performance of coding models, focusing on practical, crowd-sourced intelligence.

The Gossip

Model Musings & Sentiment Skepticism

Commenters dissect the actual capabilities of various AI coding models and the sentiment toward them. They debate specific model strengths (e.g., GPT for code, Claude for text) and weaknesses (Gemini's unreliability, Claude's API issues), often based on personal experience. Many also question whether the methodology can capture nuanced sentiment, highlighting how hard it is to gauge user experience from comments alone.

Open-Source's Ascendance & Economic Edge

A strong undercurrent in the discussion champions open-source models like Qwen and DeepSeek. Users laud their cost-effectiveness, the avoidance of vendor lock-in, and the growing potential for local, GPU-powered operations. The conversation also touches on geopolitical implications and alleged 'smear campaigns' against non-Western models, contrasting them with the transparency demands for fully open projects.

Feedback Fervor & Methodological Makeover

The HN community, true to form, deluged the author with constructive criticism and feature requests. Feedback ranged from improving data visualization (fixing unreadable graph labels) and refining the sentiment analysis methodology (e.g., tracking explicit model comparisons, historical trends) to broader critiques of the project's 'State of the Art' claim. The author actively engaged, quickly implementing some UI fixes and acknowledging future improvements.