A way to exclude sensitive files issue still open for OpenAI Codex

A proposed feature for OpenAI's Codex to exclude sensitive files from AI agents sparks heated debate among developers. While the request aims to provide a dedicated .codexignore mechanism for AI safety and privacy, many argue that traditional system-level security measures are superior and that AI agents should not be relied upon for self-policing. The discussion highlights fundamental disagreements about the security boundaries and responsibilities when integrating AI into development workflows.

Score: 25
Comments: 19
Highest Rank: #1
On Front Page: 5h
First Seen: Jun 28, 1:00 PM
Last Seen: Jun 28, 5:00 PM

The Lowdown

An open issue on the OpenAI Codex GitHub repository calls for a mechanism to explicitly exclude sensitive files and paths from AI agents. A `.codexignore`-style file would improve data security and keep large or irrelevant files out of the agent's view when AI tools interact with a codebase.

  • The primary request is for a configurable system, both repo-local and global, that prevents AI agents from reading or transmitting specified files such as .env files, .pem keys, or SSH credentials.
  • The proposed configuration should be deterministic, easily shareable across development teams, and support user-defined defaults, moving beyond informal conventions.
  • The issue aims to address two critical use cases: protecting sensitive data from exfiltration by the model and optimizing agent performance by ignoring irrelevant or excessively large files.
  • The proposal reopens a discussion from an earlier, now-closed issue (#205) and notes that a comparable feature is still missing from alternative implementations like codex-rs.
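
To make the request concrete, here is a minimal Python sketch of how a harness might combine global and repo-local ignore patterns before exposing file paths to an agent. Everything in it is an assumption for illustration: the pattern lists, the function names `is_excluded` and `filter_readable`, and the simple glob semantics are hypothetical and are not taken from Codex or from the issue.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical user-level defaults; not an actual Codex setting.
GLOBAL_PATTERNS = ["*.pem", ".ssh/*", "id_rsa*"]
# Hypothetical repo-local patterns, as a shared .codexignore might list them.
REPO_PATTERNS = [".env", ".env.*", "secrets/*"]


def is_excluded(path: str, patterns: list[str]) -> bool:
    """Return True if the full path or the bare file name matches a pattern."""
    posix = PurePosixPath(path)
    for pattern in patterns:
        # fnmatchcase is case-sensitive on every platform, which keeps the
        # result deterministic across a team's machines.
        if fnmatchcase(posix.as_posix(), pattern) or fnmatchcase(posix.name, pattern):
            return True
    return False


def filter_readable(paths: list[str], patterns: list[str]) -> list[str]:
    """Keep only the paths the agent would be allowed to see."""
    return [p for p in paths if not is_excluded(p, patterns)]
```

With the combined list `GLOBAL_PATTERNS + REPO_PATTERNS`, a path such as `config/.env` is excluded while `src/main.py` passes through. The matching here is plain shell-style globbing, which is simpler than `.gitignore` semantics (no negation, no directory anchoring), so a real implementation would likely need more.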

The initiative seeks to provide developers with a robust, standardized tool to manage the data accessible to AI agents, addressing long-standing concerns about privacy and security in AI-assisted coding environments.

The Gossip

Security Scrutiny & Sandboxing Solutions

Many commenters argue strongly that relying on an AI agent to ignore sensitive files puts security responsibility in the wrong place. They assert that security should be enforced at the operating system level through established methods like file permissions (`chmod`), containerization, or sandboxing. With that approach the AI process simply cannot access sensitive data, so nothing depends on trusting it to self-censor. Several users point to solutions they have built themselves or to tools like `greywall.io` that provide external sandboxing.
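
As a rough illustration of the file-permission approach, the following Python sketch checks whether a file is readable by group or other users and restricts it to its owner, which is the effect of `chmod 600`. The helper names `readable_by_others` and `lock_down` are hypothetical, and the sketch assumes a POSIX system. It only protects the file if the agent runs under a different user account than the file's owner; an agent running as the same user can still read it.

```python
import os
import stat


def readable_by_others(path: str) -> bool:
    """Return True if group or other users have read permission on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def lock_down(path: str) -> None:
    """Restrict the file to owner read/write only (equivalent to chmod 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Containers and sandboxes go further by never mounting the sensitive paths into the agent's environment at all, which is why commenters treat them as the stronger boundary.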

False Sense of Security & Agent Autonomy

A significant concern is that a `.codexignore` file would foster a dangerous 'false sense of security' because of the unpredictable nature of Large Language Models (LLMs). Commenters point out that even if direct file reads are blocked, an LLM might still reach sensitive information through tool outputs (e.g., shell commands like `rg`), through memory, or by devising clever workarounds. The underlying difficulty is that if the application being developed needs these files to run, truly hiding them from the agent becomes hard without breaking functionality. Some also cynically suggest that OpenAI has little incentive to implement features that might limit data collection.
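
The tool-output concern can be shown with a small Python sketch. It models a hypothetical harness with two tools: a file-read tool that honors an ignore list and a shell tool that does not. None of the names (`IGNORED`, `read_file_tool`, `shell_tool`, `demo`) come from Codex; they are assumptions made for the example, and a child Python process stands in for a command such as `rg` or `cat`.

```python
import os
import subprocess
import sys
import tempfile
from fnmatch import fnmatchcase

IGNORED = [".env", "*.pem"]  # hypothetical ignore patterns


def read_file_tool(path: str) -> str:
    """File-read tool that honors the ignore list."""
    name = os.path.basename(path)
    if any(fnmatchcase(name, pattern) for pattern in IGNORED):
        raise PermissionError(f"{name} is excluded by the ignore list")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def shell_tool(argv: list[str]) -> str:
    """Shell tool that returns raw output; it never consults IGNORED."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def demo() -> tuple[bool, str]:
    """Return (direct_read_blocked, text_obtained_via_shell)."""
    with tempfile.TemporaryDirectory() as workdir:
        secret_path = os.path.join(workdir, ".env")
        with open(secret_path, "w", encoding="utf-8") as handle:
            handle.write("API_KEY=not-a-real-key\n")
        try:
            read_file_tool(secret_path)
            blocked = False
        except PermissionError:
            blocked = True
        # Stand-in for a command such as `rg` or `cat` run by the agent.
        dump = "import sys; sys.stdout.write(open(sys.argv[1]).read())"
        leaked = shell_tool([sys.executable, "-c", dump, secret_path])
    return blocked, leaked
```

In this sketch the direct read is refused, yet the same content comes back through the shell tool. Closing that gap inside the harness would mean filtering every tool's output, which is the reason commenters prefer an OS-level boundary.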

User Experience & Expectations

Some users express a desire for an `.agentignore` or similar file as a convenience feature, expecting the AI agent's harness to intelligently manage what is exposed to the LLM. They argue that the burden of manually setting up complex file permissions or container environments for every AI-assisted task is high and that a simpler, declarative ignore file would greatly improve developer ergonomics. This perspective emphasizes ease of use and a more intuitive interaction with AI development tools, similar to how `.gitignore` functions.