Show HN: Antfly: Distributed, Multimodal Search and Memory and Graphs in Go

Antfly is a new distributed search engine and document database written in Go. It offers multimodal search (text, images, audio, video) with integrated ML inference and RAG capabilities, and this "Show HN" presents it as a scalable system with Raft consensus, ACID transactions, and single-binary deployment. The core is released under the Elastic License v2, which may prove contentious: the source stays open for modification and self-hosting, but managed service offerings are restricted. It is aimed at developers who want self-managed infrastructure with AI built in.

Score: 15
Comments: 2
Highest Rank: #8
On Front Page: 4h
First Seen: Mar 17, 4:00 PM
Last Seen: Mar 17, 7:00 PM
Rank Over Time: [chart not reproduced]

The Lowdown

Antfly is a newly announced distributed document database and search engine, written in Go, that combines full-text, vector, and graph search in one system. It is designed for multimodal data and ships with native machine learning inference, so developers can build retrieval-augmented generation (RAG) systems without depending on external APIs. This "Show HN" covers the following features:

  • Multimodal & Hybrid Search: It indexes and searches text, images, audio, and video, combining full-text (BM25), dense, and sparse vector (SPLADE) search in a single query.
  • Native ML Inference (Termite): Antfly includes Termite, a built-in service for ML tasks like embeddings, reranking, chunking, and text generation, reducing reliance on external ML APIs. It also supports major LLM providers.
  • Integrated RAG Agents: The system offers built-in RAG agents with streaming, multi-turn chat, tool calling (web search, graph traversal), and confidence scoring.
  • Graph Indexes: Automatic relationship extraction and graph traversal queries are supported, turning data into connected knowledge graphs.
  • Distributed Architecture: Built on etcd's Raft library and backed by Pebble, it features a multi-Raft design for metadata and data shards, ensuring robust distributed consensus, sharding, and replication.
  • Robustness & Verification: Its distributed protocols are tested with end-to-end chaos tests (inspired by Jepsen) and formally verified using TLA+.
  • Single-Binary Deployment: antfly swarm provides an all-in-one process for local development and small deployments, with easy scaling to multi-node clusters.
  • Ecosystem & Tooling: Antfly ships with a Kubernetes operator, an MCP server for LLM tool use, SDKs for Go, TypeScript, Python, and React, and a PostgreSQL extension (pgaf).
  • Licensing: The core server uses the Elastic License v2, permitting use, modification, and self-hosting but restricting managed service offerings, while client libraries are Apache 2.0.
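
The hybrid search bullet above says full-text, dense, and sparse results are combined in a single query, but the summary does not say how Antfly merges them. As a rough illustration only, the sketch below uses reciprocal rank fusion (RRF), one common way to merge ranked lists from different retrievers. The function name, the document IDs, and the constant k=60 are all assumptions made for this example and are not taken from Antfly.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. This is generic RRF, not Antfly's own method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from three retrievers for the same query.
bm25 = ["doc_a", "doc_b", "doc_c"]    # full-text (BM25)
dense = ["doc_b", "doc_a", "doc_d"]   # dense vector search
sparse = ["doc_b", "doc_c", "doc_a"]  # sparse vector search (e.g. SPLADE)

fused = reciprocal_rank_fusion([bm25, dense, sparse])
print(fused)
```

RRF uses only rank positions, so it does not need BM25 scores and vector similarities to share a scale. That is one reason it is often chosen for hybrid search.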

Antfly presents itself as a self-hostable option for developers who need a distributed search and data store with built-in AI and multimodal processing. Its engineering choices, from TLA+ verification to a broad set of SDKs and tooling, suggest a serious effort, and its licensing model trades unrestricted commercial hosting for open source access and project sustainability.