Your File System Is Already A Graph Database

This post argues that your file system, especially an Obsidian vault with wikilinks, is already a functional graph database, needing no special infrastructure beyond an LLM as a query engine. It demonstrates how this setup creates a 'context engineering system' that significantly enhances LLM performance by providing rich, structured input for tasks like drafting design docs or project handoffs. The approach resonated with the HN community for its pragmatic take on leveraging existing tools to solve complex knowledge management problems, steering clear of more heavyweight RAG pipelines.

10h on Front Page

First Seen: Apr 8, 10:00 AM

Last Seen: Apr 8, 7:00 PM

The Lowdown

The author, Alex Kessinger, expands on Karpathy's idea of using LLMs to build personal knowledge bases, asserting that a file system, particularly one managed with Obsidian and wikilinks, inherently functions as a graph database. This "no-infrastructure" approach uses files as nodes, wikilinks as semantic edges, and folder structures for taxonomy, with an LLM serving as the natural language query engine.
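To make the "files as nodes, wikilinks as edges" framing concrete, here is a minimal sketch that is not taken from the post. It assumes a vault of `.md` files using the common `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` link forms. The function name `build_graph` and the regular expression are illustrative choices, and embeds such as `![[image.png]]` are treated like ordinary links for simplicity.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]], and [[Target#heading]];
# only the target name is captured. (Illustrative pattern, an assumption.)
WIKILINK = re.compile(r"\[\[([^\]\|#]+)(?:#[^\]\|]*)?(?:\|[^\]]*)?\]\]")


def build_graph(vault_dir):
    """Return {note_name: sorted list of linked note names} for every .md file.

    Each file is a node keyed by its name without the extension; each
    wikilink in the file body is an outgoing edge.
    """
    graph = {}
    for path in Path(vault_dir).rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        targets = {match.strip() for match in WIKILINK.findall(text)}
        graph[path.stem] = sorted(targets)
    return graph
```

The folder path of each file is ignored here; a fuller version could keep `path.parent` alongside each node to preserve the taxonomy the post describes.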

The system addresses the common problem of reassembling scattered context (Slack threads, meeting notes, documents) for work tasks. Instead of a scramble, this knowledge base provides a systematic way to:

- Automatically create and link notes after meetings, building timelines for people and accumulating project artifacts.
- Enable an LLM agent to "spider" through related tools and documents, assembling comprehensive context for drafting design documents, vision statements, or analyses.
- Function as a "context engineering system," ensuring the LLM receives rich, historically relevant input, which significantly improves the quality of its output compared to cold prompting.

The author acknowledges that automated inbox processing (summarizing, categorizing, and correlating new content) remains a hard problem. However, getting started is simple: create a basic folder structure, consistently link notes, and then direct an LLM to relevant folders for drafting, immediately noticing the improved context. The core insight is that this method allows one's work and knowledge to compound effectively.
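The "spider" step can be pictured as a bounded breadth-first walk over the link graph. The sketch below is one possible reading rather than the author's implementation: `collect_context`, `max_hops`, and `max_chars` are hypothetical names, the graph is a plain dict of note name to linked note names, and a character count stands in for a token budget.

```python
from collections import deque


def collect_context(graph, notes, start, max_hops=2, max_chars=4000):
    """Breadth-first walk from `start` over wikilink edges.

    graph: {note_name: [linked note names]}
    notes: {note_name: note text}
    Returns note names in visit order whose combined text fits max_chars.
    """
    seen = {start}
    queue = deque([(start, 0)])
    picked, used = [], 0
    while queue:
        name, depth = queue.popleft()
        text = notes.get(name)
        if text is not None:
            if used + len(text) > max_chars:
                break  # stop at the first note that would exceed the budget
            picked.append(name)
            used += len(text)
        if depth < max_hops:
            for neighbor in graph.get(name, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return picked
```

Breadth-first order means notes closest to the starting document are included first, so the budget is spent on the most directly related context before more distant notes.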

The Gossip

Graph vs. Tree: Semantic Squabble

Discussion arose regarding whether a file system truly constitutes a 'graph database' or is merely a constrained tree. Some argued that while wikilinks introduce interconnections, a 'proper' graph implies the use of hard or soft links. Others, however, accepted the author's premise that wikilinks effectively define semantic edges, transforming the file system into a practical graph structure for the purpose described.

Flat Files vs. Folder Foundations: AI's Contextual Conundrum

A significant debate centered on whether LLMs actually *need* a human-designed folder structure. Some commenters suggested that a flat list of files, combined with advanced search mechanisms (like BM25 or grep), could allow an AI to explore and compute context dynamically, potentially saving tokens. Counterarguments highlighted that human-understandable structure is crucial for maintainability, especially during system failures, and for allowing LLMs to leverage pre-derived, higher-level conclusions that are less prone to quality degradation from AI-generated text.
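For readers unfamiliar with the flat-file alternative, the following is a small, dependency-free sketch of BM25 ranking over a flat set of notes. It was not posted in the thread; the names `tokenize` and `bm25_rank`, the naive tokenizer, and the parameter defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration, and the IDF uses the common "plus one inside the log" variant so scores stay positive.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(docs, query, k1=1.5, b=0.75):
    """Rank documents by Okapi BM25 score for a query.

    docs: {name: text}. Returns names sorted by descending score,
    omitting documents that share no terms with the query.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n if n else 0.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)
```

An agent could call something like `bm25_rank(notes, "billing migration")` to pick files to read, with no folder hierarchy involved; whether that beats a curated structure is exactly what the commenters disagreed about.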

Private Vaults, Public Models: The Local LLM Lag

Concerns about data privacy were prominent, leading many to attempt using local LLMs with their private Obsidian vaults. However, commenters noted that local models often struggle to match the performance of commercial alternatives, even with substantial hardware. Fine-tuning local models runs into a 'catch-22' because private training data is difficult to gather. A related observation was that allowing LLMs to write their own text into knowledge repositories can degrade the overall quality of the context over time.

AI-Assisted Arrangement: The Perennial Pursuit of Order

The author's struggle to maintain an organized knowledge base resonated with many users. Commenters shared their own experiences using LLMs (or even older tools like spreadsheets) to organize messy hard drives and enforce naming conventions. The general sentiment was that while LLMs offer promising assistance in structuring and managing information, the ultimate goal of consistent, long-term order remains an ongoing challenge.