A few words on DS4
Antirez, the creator of Redis, has released DS4, a local AI project built on DeepSeek v4 Flash that brings near-frontier model performance to consumer hardware. The open-source project is rapidly gaining traction for enabling serious local inference that rivals cloud-based GPT and Claude models, and Hacker News is abuzz with users surprised that it handles long contexts and complex tasks on machines with 128GB of RAM.
The Lowdown
Antirez has unveiled DS4, a new local AI project built around the DeepSeek v4 Flash model. It has quickly gained popularity for bringing near-frontier performance to consumer-grade hardware, offering a compelling local alternative to powerful cloud LLMs like Claude or GPT. Antirez credits a confluence of factors, including the new model's efficiency and the accumulated experience of the local AI movement.
- DS4 utilizes an "extremely asymmetric quants recipe of 2/8 bit" which allows it to run efficiently.
- It requires 96-128GB of RAM, making it accessible on high-end Macs or specialized "GPU in a box" setups.
- The author developed DS4 in about a week, drawing on existing local AI knowledge and the new DeepSeek model.
- Antirez states it's the first time he's found a local model capable of the "serious stuff" he'd typically ask of cloud LLMs.
- Future plans include quality benchmarks, integrating a coding agent, setting up a hardware CI test environment, more ports, and distributed inference.
- The underlying model is designed to be swappable, ensuring DS4 adapts to the "best current open weights model that is practically fast."
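To see why an asymmetric 2/8-bit quantization recipe matters for the 96-128GB RAM requirement, here is a rough back-of-the-envelope sketch. The parameter count and the 95%/5% low/high-precision split are illustrative assumptions for the arithmetic only, not DS4's actual recipe:

```python
# Illustrative arithmetic: weight storage under a mixed 2/8-bit quant recipe.
# The parameter count and the 95/5 precision split are assumptions,
# not DS4's actual configuration.

def quantized_size_gb(params: float, frac_low: float,
                      low_bits: int = 2, high_bits: int = 8) -> float:
    """Approximate weight storage in GB for a mixed-precision model."""
    total_bits = params * (frac_low * low_bits + (1 - frac_low) * high_bits)
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

total_params = 400e9  # hypothetical parameter count for illustration
print(f"fp16 baseline: {total_params * 2 / 1e9:.0f} GB")      # 800 GB
print(f"2/8-bit mix:   {quantized_size_gb(total_params, 0.95):.0f} GB")  # 115 GB
```

Under these assumed numbers, keeping 95% of weights at 2 bits while reserving 8 bits for the most sensitive 5% shrinks an 800GB fp16 model to roughly 115GB, which is exactly the range where a 128GB Mac becomes viable.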
Antirez emphasizes that AI is too critical to remain solely a provided service, underscoring the importance of powerful, locally run models like DS4.
The Gossip
Astonishing AI Ascent
Users are highly impressed by DS4's capabilities, particularly its performance on commodity hardware. Many report running it successfully on Apple Silicon Macs with 128GB of RAM, noting its "incredible" speed and ability to handle "long context reasoning" (up to 124k tokens) without degradation, a feat they haven't seen in many frontier models. Its proficiency in tasks like code generation and tool execution is frequently highlighted, leading to excitement about its potential as a robust local AI solution.