Epicycles All the Way Down
This essay critically examines the fundamental nature of Large Language Models, arguing that they are sophisticated pattern-fitters rather than true understanding engines, despite their remarkable capabilities. It dissects the 'epicycles' problem, in which incremental fixes are layered on without addressing core generative limitations, producing failure modes unlike human ones. The piece offers a nuanced perspective on AI intelligence and consciousness, making it a compelling read for anyone interested in the philosophical and technical underpinnings of LLMs.
The Lowdown
Rohit Krishnan's "Epicycles All The Way Down" explores the inherent nature of Large Language Models, positing that despite their extraordinary success, they operate primarily as highly effective pattern-fitters rather than systems with genuine understanding. Drawing on personal anecdotes, philosophical concepts, and mathematical theorems, the essay unpacks why LLMs so often achieve impressive results through approximation yet struggle to recover the generative rules underlying their data.
- The author opens with personal experiences, such as re-deriving formulas during exams and a failed poker strategy built on intuition rather than calculation, to illustrate the difference between understanding and pure pattern matching.
- LLMs are characterized as "over-fit pattern-fitters" that keep adding "epicycles" (ad-hoc adjustments) to improve performance, rather than uncovering the underlying generative rules.
- The essay references Gold's theorem, suggesting that training solely on positive examples can lead LLMs to infer a program that fits the data, but not necessarily the intended true program (a toy illustration follows this list).
- It highlights an asymmetry: success is "low-dimensional" (few ways to be right) while failure is "high-dimensional" (infinite ways to be wrong), and LLMs explore that failure space in ways humans never would.
- "Understanding as compression" (Eric Baum) and high-dimensional memory (Kanerva) are presented as a compelling account of how LLMs function, storing thoughts and sensations as coordinates in a high-dimensional space (sketched in code after this list).
- The author's own experiments with transformers trained on Elementary Cellular Automata and wave functions show the models approximating the outputs rather than learning the fundamental rules (see the ECA sketch below).
- Reasoning in LLMs is framed as a form of search over latent programs: models are taught to think step by step, but this still amounts to learning patterns of reasoning, not necessarily true generative understanding (a toy program search rounds out the sketches below).
- The essay challenges simplistic notions of LLM consciousness, suggesting their intelligence is more akin to a superintelligent market or swarm, lacking continuous, subjective experience.
- It proposes that successful AI alignment will resemble managing other complex, superintelligent systems like the economy, involving extensive rules, supervision, and co-evolution.
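
To make the Gold's-theorem point concrete, here is a minimal Python sketch (mine, not the essay's); the three hypotheses are hypothetical toys, not examples from the source. Each stands in for a "program," and all of them fit the same positive data:

```python
# Positive examples alone cannot distinguish the intended program from
# broader hypotheses that also fit. (Toy hypotheses for illustration.)

positive_examples = [2, 4, 8, 16]  # drawn from the intended language: powers of two

hypotheses = {
    "powers of two":         lambda n: n > 0 and (n & (n - 1)) == 0,
    "even numbers":          lambda n: n % 2 == 0,
    "all positive integers": lambda n: n > 0,
}

# Every hypothesis is consistent with the data, yet each predicts
# differently on unseen inputs (e.g., 6). Without negative examples,
# nothing rules out the broader languages.
for name, member in hypotheses.items():
    fits = all(member(x) for x in positive_examples)
    print(f"{name}: fits data = {fits}, contains 6 = {member(6)}")
```

All three hypotheses pass; the learner's choice among them is underdetermined by positive data alone, which is the worry the essay applies to LLMs.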
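The Kanerva-style "coordinates in space" idea can also be sketched in a few lines. This is a hedged illustration of the geometry, not the essay's actual model: items stored as random points in a very high-dimensional binary space can be recalled from heavily corrupted queries, because random points in such spaces are almost always far apart.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 10_000  # Kanerva-style hyperdimensional representation

# Store each "thought" as a random binary point in high-dimensional space.
memory = {name: rng.integers(0, 2, DIM) for name in ["apple", "poker", "epicycle"]}

def recall(query):
    """Return the stored item nearest to the query in Hamming distance."""
    return min(memory, key=lambda name: np.sum(memory[name] != query))

# Even with 30% of its bits flipped, a query stays closer to its original
# (distance ~3000) than to any unrelated point (~5000): geometry does the work.
noisy = memory["epicycle"].copy()
flipped = rng.choice(DIM, size=3_000, replace=False)
noisy[flipped] ^= 1
print(recall(noisy))  # -> "epicycle"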
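For the cellular-automaton experiments, the contrast the essay draws is between a transformer's many parameters and the true generative program, which for an elementary CA is just an 8-entry lookup table. A minimal simulator follows; Rule 110 is an arbitrary illustrative choice, since this summary does not specify which rules were tested:

```python
import numpy as np

def eca_step(state, rule=110):
    """One step of an elementary cellular automaton: each cell's next value
    is read from an 8-entry lookup table indexed by its 3-cell neighborhood.
    That table is the entire generative law."""
    left, right = np.roll(state, 1), np.roll(state, -1)   # periodic boundary
    index = 4 * left + 2 * state + right                  # neighborhood as a 3-bit number
    table = (rule >> np.arange(8)) & 1                    # unpack the rule number into bits
    return table[index]

state = np.zeros(64, dtype=int)
state[32] = 1  # single live cell
for _ in range(5):
    print("".join("#" if c else "." for c in state))
    state = eca_step(state)
```

The entire law fits in one byte; a pattern-fitter can approximate its behavior over the training distribution without ever recovering the table itself.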
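Finally, "reasoning as search over latent programs" can be caricatured with a brute-force enumeration. This is a deliberately tiny stand-in for the essay's framing, not its mechanism: candidate programs are generated and scored against observed examples, and each observation prunes the space of survivors.

```python
from itertools import product

# Toy "latent program" search: enumerate small arithmetic programs and keep
# the ones consistent with observed input-output pairs.
examples = [(1, 3), (2, 5), (3, 7)]  # hidden program: x -> 2*x + 1

program_forms = [
    ("x + c",   lambda x, a, c: x + c),
    ("a * x",   lambda x, a, c: a * x),
    ("a*x + c", lambda x, a, c: a * x + c),
]

survivors = [
    (name, a, c)
    for (name, f), a, c in product(program_forms, range(4), range(4))
    if all(f(x, a, c) == y for x, y in examples)
]
print(survivors)  # -> [('a*x + c', 2, 1)]: the search recovers x -> 2*x + 1
```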
The core argument concludes that while LLMs are incredibly capable machines born of their evolutionary data, they mostly act like water flowing downhill, pulled toward patterns rather than toward parsimonious, foundational laws. Scale helps, but it may not fundamentally shift this paradigm, suggesting that LLMs may always excel at tasks humans can already do while lacking the adaptability for tasks humans invent.