Show HN: Antfly: Distributed, Multimodal Search and Memory and Graphs in Go
Antfly is a new distributed search engine and document database built in Go, offering multimodal search (text, images, audio, video) with integrated ML inference and RAG capabilities. This "Show HN" presents a system built around Raft consensus, ACID transactions, and a single-binary deployment for ease of use. Its core uses the Elastic License v2, a potentially contentious choice that aims for sustainability while keeping the source open for modification and self-hosting, which should appeal to developers who want self-managed infrastructure with AI at its core.
The Lowdown
Antfly is a newly unveiled distributed document database and search engine, written in Go, that integrates full-text, vector, and graph search. Designed for multimodal data, it includes native machine learning inference, so developers can build retrieval-augmented generation (RAG) systems without depending on external APIs. The "Show HN" post highlights the following features:
- Multimodal & Hybrid Search: It indexes and searches text, images, audio, and video, combining full-text (BM25), dense, and sparse vector (SPLADE) search in a single query.
- Native ML Inference (Termite): Antfly includes Termite, a built-in service for ML tasks like embeddings, reranking, chunking, and text generation, reducing reliance on external ML APIs. It also supports major LLM providers.
- Integrated RAG Agents: The system offers built-in RAG agents with streaming, multi-turn chat, tool calling (web search, graph traversal), and confidence scoring.
- Graph Indexes: Automatic relationship extraction and graph traversal queries are supported, turning data into connected knowledge graphs.
- Distributed Architecture: Built on etcd's Raft library and backed by Pebble, it features a multi-Raft design for metadata and data shards, ensuring robust distributed consensus, sharding, and replication.
- Robustness & Verification: Its distributed protocols are rigorously tested with end-to-end chaos tests (inspired by Jepsen) and formally verified using TLA+.
- Single-Binary Deployment: antfly swarm provides an all-in-one process for local development and small deployments, with easy scaling to multi-node clusters.
- Ecosystem & Tooling: Antfly ships with a Kubernetes operator, an MCP server for LLM tool use, and various SDKs in Go, TypeScript, Python, and React, alongside a PostgreSQL extension (pgaf).
- Licensing: The core server uses the Elastic License v2, permitting use, modification, and self-hosting but restricting managed service offerings, while client libraries are Apache 2.0.
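To make the hybrid search bullet concrete: combining BM25, dense, and sparse results in one query requires merging several ranked lists into a single ranking. The post summary does not say how Antfly does this, so the sketch below uses reciprocal rank fusion (RRF), a common technique, purely as an illustration. The function name fuseRRF, the document IDs, and the constant k = 60 are assumptions for this example and are not taken from Antfly's code or API.

```go
package main

import (
	"fmt"
	"sort"
)

// fuseRRF is a hypothetical helper (not an Antfly API). It merges ranked
// lists of document IDs with reciprocal rank fusion: each document scores
// the sum of 1/(k + rank) over the lists in which it appears.
func fuseRRF(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1)) // ranks are 1-based
		}
	}

	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}

	// Highest fused score first; ties are broken by ID so output is stable.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	// Made-up result lists standing in for the three retrievers.
	bm25 := []string{"doc3", "doc1", "doc7"}
	dense := []string{"doc1", "doc5", "doc3"}
	sparse := []string{"doc1", "doc3", "doc9"}

	fmt.Println(fuseRRF(60, bm25, dense, sparse))
}
```

With these inputs, doc1 ranks first because it appears near the top of all three lists, followed by doc3. RRF works on ranks rather than raw scores, so BM25 scores and vector similarities never need to be normalized to a common scale. A real engine might instead use weighted score combination or a reranker.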
Antfly presents itself as a self-hostable alternative for developers who need a distributed search and data store with deeply integrated AI and multimodal processing. Its engineering, from TLA+ verification to a broad ecosystem of tooling, positions it as a serious option for building AI-backed applications, and its licensing model tries to balance open access with sustainability.