Show HN: I built a tiny LLM to demystify how language models work
A developer created GuppyLM, a tiny ~9M-parameter LLM with the persona of a small fish, built to demystify how language models work. The project shows that a complete LLM pipeline, from data generation through training to inference, can be assembled in minutes on a free Colab GPU without extensive resources. It serves as an accessible, hands-on educational tool that strips away the 'magic' of LLM internals for aspiring builders.
The Lowdown
GuppyLM is a minimalist language model designed to make the inner workings of LLMs transparent and accessible. Its creator built this small-scale model to show that constructing a working LLM requires neither PhD-level expertise nor vast computational power, turning abstract concepts into something you can train, inspect, and modify yourself.
- Tiny Scale, Big Purpose: GuppyLM is an 8.7-million-parameter model built on a vanilla transformer architecture, intentionally omitting modern refinements such as grouped-query attention (GQA) and rotary position embeddings (RoPE).
- Rapid Training: It can be trained from scratch in about five minutes on a free Google Colab T4 GPU, making the entire process highly approachable for experimentation.
- Unique Personality: The model is trained to embody a "guppy" personality, speaking in short, lowercase sentences about a fish's world (water, food, tank life), and intentionally avoids human abstractions.
- Synthetic Data: Training relies on 60,000 synthetic conversations across 60 topics, ensuring a consistent and focused personality for the fish character.
- Educational Accessibility: The project provides Colab notebooks for both chatting with the pre-trained GuppyLM and training a new instance, offering a full pipeline from raw text to trained output.
- Design Rationale: Key design choices include omitting a system prompt (personality is baked in), focusing on single-turn conversations due to a limited context window, and using a vanilla transformer for simplicity and clarity.
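To make the "vanilla transformer" choice concrete, here is a minimal sketch of one such block in NumPy: standard causal multi-head attention in which every head keeps its own keys and values (no GQA), with learned absolute position embeddings added to the token embeddings instead of RoPE. The dimensions, initialization, and parameter names are illustrative assumptions, not GuppyLM's actual code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def vanilla_block(x, params, n_heads):
    """One pre-norm transformer block: standard multi-head attention
    (every head has its own K/V, i.e. no GQA) followed by a 2-layer MLP."""
    T, d = x.shape
    hd = d // n_heads
    h = layer_norm(x)
    q, k, v = h @ params["wq"], h @ params["wk"], h @ params["wv"]
    # split into heads: (n_heads, T, hd)
    q, k, v = (a.reshape(T, n_heads, hd).transpose(1, 0, 2) for a in (q, k, v))
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(hd)
    # causal mask: each token attends only to itself and earlier tokens
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    att = softmax(np.where(mask, -1e9, scores)) @ v        # (n_heads, T, hd)
    x = x + att.transpose(1, 0, 2).reshape(T, d) @ params["wo"]
    h = layer_norm(x)
    return x + np.maximum(h @ params["w1"], 0) @ params["w2"]  # ReLU MLP

rng = np.random.default_rng(0)
d, T, n_heads = 64, 8, 4  # toy sizes, far smaller than even GuppyLM
params = {name: rng.normal(0, 0.02, shape) for name, shape in {
    "wq": (d, d), "wk": (d, d), "wv": (d, d), "wo": (d, d),
    "w1": (d, 4 * d), "w2": (4 * d, d)}.items()}
# learned absolute position embeddings added to token embeddings (no RoPE)
tok = rng.normal(0, 0.02, (T, d))
pos = rng.normal(0, 0.02, (T, d))
out = vanilla_block(tok + pos, params, n_heads)
print(out.shape)  # (8, 64)
```

Stacking a few such blocks, plus an embedding table and an output projection, is essentially all a model at this scale needs.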
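The no-system-prompt, single-turn design can be illustrated with a toy data-generation sketch: the persona lives entirely in the training examples, so each rendered conversation is one user question followed by one in-character answer. The delimiter tokens, topics, and replies below are hypothetical stand-ins, not GuppyLM's actual dataset or chat format.

```python
import random

# Hypothetical topic bank; the real project uses 60 topics and 60,000 examples.
TOPICS = {
    "food": ["what do you eat?", "are flakes good?"],
    "tank": ["do you like your tank?", "is the water warm?"],
}
REPLIES = {
    "food": "i like flakes. tiny bites. yum.",
    "tank": "my tank is nice. warm water. good plants.",
}

def render_example(topic: str, question: str) -> str:
    # Single turn only: one question, one answer, then end-of-text.
    # No system prompt -- the lowercase fish persona is baked into every reply.
    return f"<user>{question}<guppy>{REPLIES[topic]}<eot>"

def build_dataset(n: int, seed: int = 0) -> list[str]:
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    examples = []
    for _ in range(n):
        topic = rng.choice(list(TOPICS))
        examples.append(render_example(topic, rng.choice(TOPICS[topic])))
    return examples

for ex in build_dataset(4):
    print(ex)
```

Because every training example already speaks in-character, inference needs no system prompt: the model has simply never seen any other voice.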
In essence, GuppyLM is not chasing state-of-the-art performance; it is a tangible, reproducible example of an LLM that lets users see the same core components and principles at work behind far larger, more complex models.