If DSPy is so great, why isn't anyone using it?
DSPy, an AI framework, has low adoption despite offering robust engineering patterns for LLM systems. The author contends that developers who skip it often end up reinventing the same patterns, painfully and less well, which suggests the framework's architecture anticipated real needs. The article resonates with HN readers because it validates their struggles in building complex LLM applications and offers principled approaches to maintainability and scalability.
The Lowdown
The article, "If DSPy is so great, why isn't anyone using it?", examines a paradox: DSPy is praised by its users for tackling complex AI engineering challenges, yet it struggles to gain widespread adoption. Author Skylar Payne contends that while DSPy's abstractions may seem hard or unfamiliar at first, the problems it solves are inevitable for any sufficiently complex AI system, so many developers end up inadvertently building "worse versions of DSPy."
- DSPy has low adoption compared to frameworks like LangChain, despite offering benefits like faster model testing, maintainability, and focus on context over plumbing.
- The core issue is DSPy's unfamiliar abstractions, which require a different way of thinking, contrasting with the immediate need to "make the pain go away."
- Payne introduces "Khattab's Law," stating that complex AI systems eventually contain an ad hoc, bug-ridden implementation of half of DSPy.
- He illustrates the typical evolution of an AI system through seven stages: from a simple API call to advanced features like prompt management, structured outputs, retries, RAG integration, robust evaluation, and model agnosticism.
- The end point of this evolution is that developers have often inadvertently "built a worse version of DSPy" by custom-implementing these patterns.
- DSPy packages fundamental software engineering principles for AI: Signatures (typed I/O), Modules (composable units), and Optimizers (prompt improvement logic).
- Good engineers often write "bad AI code" due to weird feedback loops (probabilistic output, hard to debug), pressure to ship (prioritizing immediate results over clean architecture), and unclear boundaries (prompts as both code and data).
- Two options are presented: use DSPy and accept the learning curve, or "steal the ideas" by implementing its patterns from day one: typed I/O, prompts separated from code, composable units, early eval infrastructure, and abstracted model calls.
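To make the "steal the ideas" option concrete, here is a minimal sketch in plain Python with no DSPy dependency. It is not DSPy's API or the author's code. Every name in it (LanguageModel, ClassifyTicket, KeywordStubModel, the ticket example) is hypothetical and chosen only for illustration, and the stub model stands in for a real LLM client so the example runs offline. It shows typed I/O, a prompt kept apart from the calling logic, a composable unit, an abstracted model call, and a tiny eval.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, Tuple


class LanguageModel(Protocol):
    """Abstracted model call: anything that maps a prompt to a completion."""

    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class TicketInput:
    """Typed input for the unit (hypothetical example domain)."""
    text: str


@dataclass(frozen=True)
class TicketLabel:
    """Typed output for the unit."""
    category: str


# Prompt separation: the template lives apart from the calling logic.
CLASSIFY_PROMPT = (
    "Classify this support ticket as 'billing' or 'technical'.\n"
    "Ticket: {text}\n"
    "Category:"
)
ALLOWED_CATEGORIES = {"billing", "technical"}


class ClassifyTicket:
    """Composable unit: typed input in, validated typed output out."""

    def __init__(self, lm: LanguageModel, template: str = CLASSIFY_PROMPT):
        self.lm = lm
        self.template = template

    def __call__(self, ticket: TicketInput) -> TicketLabel:
        raw = self.lm.complete(self.template.format(text=ticket.text))
        category = raw.strip().lower()
        if category not in ALLOWED_CATEGORIES:
            # Fail loudly instead of passing malformed output downstream.
            raise ValueError(f"unexpected category: {raw!r}")
        return TicketLabel(category=category)


class KeywordStubModel:
    """Deterministic stand-in for a real model client (assumption: offline use)."""

    def complete(self, prompt: str) -> str:
        ticket_line = prompt.split("Ticket:", 1)[1]
        return "billing" if "invoice" in ticket_line.lower() else "technical"


def accuracy(unit: ClassifyTicket, examples: Sequence[Tuple[str, str]]) -> float:
    """Early eval infrastructure: fraction of labeled examples answered correctly."""
    correct = sum(
        unit(TicketInput(text)).category == expected for text, expected in examples
    )
    return correct / len(examples)


EXAMPLES = [
    ("My invoice is wrong", "billing"),
    ("The app crashes on login", "technical"),
]

classifier = ClassifyTicket(KeywordStubModel())
score = accuracy(classifier, EXAMPLES)
print(f"accuracy: {score:.2f}")
```

Because ClassifyTicket depends only on the LanguageModel protocol, swapping the stub for a real provider client, or comparing two prompt templates against the same examples, should not require touching the calling code.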
The article's central message is that DSPy's patterns are not optional for sophisticated AI systems. Whether developers choose to adopt DSPy directly or implement its foundational concepts independently, understanding its purpose is crucial for building robust, maintainable, and scalable LLM applications.
The Gossip
Dearth of DSPy's Deep Dive
Many commenters echoed the article's premise that DSPy isn't widely used, offering reasons ranging from simple lack of awareness ("Never heard of it") to perceived difficulty and ergonomic issues. Critics highlighted the significant upfront work required to build evaluation datasets, the degree of commitment the framework demands, and challenges with prompt extraction and dynamic typing in mature codebases. The author accepted many of these points as valid gripes and attributed others to misunderstandings about DSPy's immediate benefits.
Contentious Claims and Consulting Concerns
Some readers expressed skepticism about the article's true intentions, initially perceiving it as a veiled advertisement for consulting services or dismissing DSPy's core ideas as mere marketing hype. A specific point of contention arose regarding the suggestion of using a database for prompt management, which some interpreted as a sign of poor CI/CD practices rather than a need for rapid iteration, prompting the author to clarify and accept feedback.
Inevitable Implementations
A significant portion of the discussion affirmed the author's central argument: even if not using DSPy directly, teams inevitably build custom, less robust versions of its core patterns. Commenters related to the "great engineers write bad AI code" sentiment and agreed that proper evaluation metrics are crucial but often difficult to establish early on. This suggests a recognition of the underlying architectural problems DSPy aims to solve, even among those who haven't adopted the framework.