Codex logging bug may write TBs to local SSDs

A serious bug in OpenAI's Codex application causes it to write hundreds of terabytes of log data per year to users' local SSDs, rapidly consuming their write endurance and potentially wearing out the drives. The revelation has ignited a heated discussion on Hacker News about the software quality practices of major AI developers and the implications of 'vibe coding'. Commenters debate whether AI-generated code is inherently more prone to such critical failures and point to the consequences of lax oversight in widely distributed applications.

Time on Front Page: 10h

First Seen: Jun 22, 8:00 AM

Last Seen: Jun 22, 5:00 PM

The Lowdown

A critical bug in OpenAI's Codex application causes it to write an enormous amount of data to local SQLite feedback logs, which could lead to premature failure of users' solid-state drives (SSDs). Extrapolated from the observed rate, the writes come to roughly 640 TB per year, well beyond the warranted write endurance of many consumer SSDs, so a software bug can end up wearing out hardware.

Excessive Logging: Codex continuously writes to ~/.codex/logs_2.sqlite and related files, with one user observing 37 TB written over 21 days.
SSD Endurance Risk: This write volume translates to hundreds of full-drive writes annually, far surpassing the rated endurance (TBW) of many SSDs.
Root Cause: The primary culprit is the SQLite feedback log sink being installed with a global TRACE default, capturing low-value dependency logs and large raw protocol payloads.
Write Amplification: Despite a relatively stable retained log size, continuous insert-and-prune operations cause significant write amplification, with tens of thousands of rows inserted every 15 seconds.
Log Content Analysis: Analysis shows TRACE logs (70.7%) and mirrored OpenTelemetry events (25.3%) are responsible for the vast majority of retained log bytes.
Proposed Solutions: The suggested fixes include narrowing logging thresholds, filtering out dependency noise, avoiding full payload persistence, and implementing global log size/write caps.
Historical Context: This isn't an isolated incident, with several related GitHub issues dating back over a year detailing similar excessive logging and performance problems.
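
The roughly 640 TB per year figure follows from the reported observation of 37 TB written over 21 days. A minimal sketch of that arithmetic is below. The 600 TBW endurance rating and the 1 TB capacity are illustrative assumptions, not figures from the report, and the function names are hypothetical.

```python
def annual_writes_tb(observed_tb: float, observed_days: float) -> float:
    """Linearly extrapolate an observed write volume to one year."""
    return observed_tb / observed_days * 365


def years_to_exhaust(tbw_rating_tb: float, annual_tb: float) -> float:
    """Years until the rated endurance (TBW) is used up at a constant rate."""
    return tbw_rating_tb / annual_tb


# Observation reported by one user: 37 TB over 21 days.
annual = annual_writes_tb(37, 21)

# Assumed values for illustration only (not from the report).
ASSUMED_TBW_RATING_TB = 600
ASSUMED_CAPACITY_TB = 1

years = years_to_exhaust(ASSUMED_TBW_RATING_TB, annual)
drive_writes_per_year = annual / ASSUMED_CAPACITY_TB

print(f"Extrapolated writes: {annual:.0f} TB/year")
print(f"Full-drive writes per year on the assumed drive: {drive_writes_per_year:.0f}")
print(f"Years to reach the assumed TBW rating: {years:.2f}")
```

Under these assumptions the rated endurance would be consumed in a little under a year. A larger or higher-rated drive would last proportionally longer.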

The severity of this bug from a leading AI company raises significant concerns about software development practices, quality assurance, and the potential impact of seemingly minor oversights when magnified by widespread deployment.
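
The insert-and-prune pattern noted in the list above explains how a log can stay small on disk while still generating heavy write traffic. The toy simulation below illustrates the effect with an in-memory SQLite database. The table name, row size, batch size, and retention limit are all made-up values, and it counts rows rather than physical bytes, so it understates real write amplification from journaling and page rewrites.

```python
import sqlite3


def simulate(batches: int, rows_per_batch: int, keep_rows: int) -> tuple[int, int]:
    """Insert rows in batches, pruning to the newest `keep_rows` after each batch.

    Returns (total rows ever written, rows retained at the end).
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real Codex log schema is not described in the source.
    conn.execute(
        "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    written = 0
    for _ in range(batches):
        conn.executemany(
            "INSERT INTO logs (body) VALUES (?)",
            [("x" * 100,)] * rows_per_batch,
        )
        written += rows_per_batch
        # Prune everything except the newest keep_rows rows.
        conn.execute(
            "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?",
            (keep_rows,),
        )
    retained = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return written, retained


written, retained = simulate(batches=20, rows_per_batch=1000, keep_rows=1500)
print(f"rows written: {written}, rows retained: {retained}")
```

The retained count stays capped at the retention limit while the written count keeps growing with every batch, which is why the size of the log file alone is a poor indicator of SSD wear.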

The Gossip

Slopware Scrutiny

Many commenters lambasted OpenAI for what they perceive as 'slopware' or 'vibe-coded' software, criticizing the company's apparent lack of quality control given its resources and AI capabilities. The prevailing sentiment was that a bug of this magnitude, capable of destroying hardware, reflects deeply flawed development and review processes. Some pointed to similar issues in other OpenAI products and in competing tools such as Claude, suggesting a broader trend of low-quality software from AI-focused companies.

Human vs. AI Code Quality

A significant thread of discussion revolved around the implications for AI-generated code. Some argued that while humans make mistakes, they learn from them, whereas AI-generated code might repeatedly produce similar issues if not properly supervised or trained. Others countered that the problem often lies in human decisions (e.g., setting log levels) and supervision, or that AI models do improve over time, albeit in aggregate rather than from individual errors. The debate touched upon 'comprehension debt' and the challenges of reviewing AI-produced code that can be convincingly wrong.

Mitigation and Monitoring Mishaps

Commenters offered immediate workarounds, such as using SQLite triggers to block inserts or employing RAM-backed `tmpfs` to prevent SSD wear, while acknowledging these don't fix the root problem. There was also strong criticism regarding OpenAI's delayed response to this and similar long-standing issues, questioning the effectiveness of their internal monitoring and review processes for critical bugs, especially given the potential for AI tools to assist in such tasks.
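
The trigger-based workaround can be sketched as follows, using an in-memory database so nothing on disk is touched. The table name `logs` and its columns are assumptions, since the actual schema of the Codex log database is not given here, and whether Codex tolerates silently dropped inserts is not confirmed by the source.

```python
import sqlite3

# In-memory stand-in for the real log database; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body TEXT)")

# A BEFORE INSERT trigger that raises IGNORE makes SQLite skip the row
# without reporting an error to the writer.
conn.execute(
    """
    CREATE TRIGGER block_log_inserts
    BEFORE INSERT ON logs
    BEGIN
        SELECT RAISE(IGNORE);
    END
    """
)

# The writer's INSERT appears to succeed, but no row is stored.
conn.execute("INSERT INTO logs (body) VALUES ('trace payload')")
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(f"rows stored after insert: {row_count}")
```

As the commenters acknowledged, this only suppresses the symptom. The application still builds and submits every log row, and an update could recreate or rename the database.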