Show HN: I am building a map of people who lived in the Roman Empire
An individual without formal classical or web development training used AI tools to build a map of ordinary people from the Roman Empire, extracting names from more than half a million Latin inscriptions. The project relies on an "AI supervised AI extraction" pipeline in which a higher-level model tunes the prompts given to a lower-level one. This "Show HN" shows how personal initiative combined with current AI tooling can address a gap that existing research databases leave open, the kind of practical ingenuity the community tends to appreciate.
The Lowdown
Curious about how many ordinary people are actually known from the Roman era, a lone developer set out to map their names. Existing classical databases either focused on officials or were split by region, so the creator used modern AI tools to build a system that extracts and visualizes names from a large corpus of Latin inscriptions.
- The project began with a desire to identify ordinary people of the Roman world, a niche not fully covered by specialized academic databases like Trismegistos or LIRE.
- A robust pipeline was developed to process over 500,000 Latin inscriptions from the Epigraphic Database Clauss-Slaby (EDCS), extracting individuals' names and associated data.
- A key innovation involves using high-end Large Language Models (LLMs) such as Claude, Gemini Pro, or Sonnet to supervise and tune the name extraction process, notably employing "higher level AI [to] tune the prompt for the lower level AI."
- The extraction process achieves F1 scores between 0.64 and 0.87, with the developer aiming for an error rate of less than 1-2% in smaller samples.
- Significant findings include the time saved by AI-supervised AI extraction and a gain of about 10 F1 points when models were given the raw, uncleaned text with its markers intact.
- The output is presented on an interactive map visualizing approximately 250,000 inscriptions, allowing users to explore entries, view extracted names (praenomen, nomen, cognomen, status, gender), and access summaries, translations, and original sources.
- As a "Show HN" project, it was built by an individual with no formal training in classics or professional web development, relying instead on self-learning and AI assistance.
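To make the F1 figures in the list above concrete, here is a minimal sketch of how an extraction could be scored against a hand-labelled sample. It is not the author's evaluation code: the `PersonName` record, the `f1_score` helper, the exact-match rule, and the sample names are all assumptions made for illustration, borrowing only the praenomen, nomen, and cognomen fields mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    """Hypothetical record for one extracted individual."""
    praenomen: str = ""
    nomen: str = ""
    cognomen: str = ""


def f1_score(predicted: set[PersonName], gold: set[PersonName]) -> float:
    """Exact-match F1 between extracted names and hand-labelled names."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of extractions that are right
    recall = true_positives / len(gold)          # share of real names that were found
    return 2 * precision * recall / (precision + recall)


# Invented labelled sample for a single inscription (assumption, not project data).
gold = {
    PersonName("Marcus", "Valerius", "Maximus"),
    PersonName("", "Iulia", "Prima"),
    PersonName("Gaius", "Cornelius", "Felix"),
}
# Invented model output: two exact matches and one wrong cognomen.
predicted = {
    PersonName("Marcus", "Valerius", "Maximus"),
    PersonName("", "Iulia", "Prima"),
    PersonName("Gaius", "Cornelius", "Felicio"),
}

score = f1_score(predicted, gold)
print(f"F1 = {score:.2f}")  # prints "F1 = 0.67"
```

With two of three names matched exactly, precision and recall are both 2/3, so F1 is also 2/3. Exact matching is a strict choice: a real evaluation might give partial credit when only one name part is wrong, which would raise the score.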
This "Show HN" project shows how personal drive and contemporary AI tools can open up complex data analysis to non-specialists. It offers a publicly accessible view, with machine translations, into the lives of ordinary people across the Roman Empire.