Automatic Textbook Formalization

RepoProver is a research codebase from Facebook Research for automatically formalizing mathematics textbooks. It uses a multi-agent scaffold that orchestrates large language model (LLM) agents to translate, prove, and review mathematical content in the Lean theorem prover, with the goal of reducing the manual effort needed to produce formal proofs of complex mathematical texts.

Multi-Agent Architecture: RepoProver uses a collaborative system of LLM agents: 'sketchers' translate definitions and theorem statements from LaTeX into Lean, 'provers' construct the formal proofs, and 'reviewers' check quality through a pull-request-like mechanism.
Collaboration and Version Control: Agents coordinate their work using a lightweight, file-system-based issue tracker and manage code integration via a merge queue, ensuring the Lean project's main branch remains consistently functional and buildable.
Demonstrated Success: The project successfully formalized the graduate textbook 'Algebraic Combinatorics' by Darij Grinberg, showcasing its capability to handle a substantial mathematical work.
Setup Requirements: Users need Python 3.10+, a Lean project integrated with Mathlib, organized LaTeX source files, a CONTENTS.md for structural documentation, and a manifest.json detailing target theorems and definitions. Git initialization and an empty issues/ directory are also necessary.
Usage and Scalability: The system can be run locally using a coordinator script or distributed across multiple machines via SLURM for large-scale formalization tasks, with configurable agent pools and parallel processing.
Analysis and Debugging Tools: Included scripts allow for token usage breakdown, agent efficiency plotting, and a web-based trajectory viewer to inspect agent actions during a run.
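
The setup requirements listed above can be checked before a run with a small preflight script. The following is a minimal sketch and not part of RepoProver: the function name check_project is hypothetical, and it assumes that CONTENTS.md, manifest.json, and issues/ all sit at the project root, which the summary does not confirm. It does not validate the manifest's schema, the Lean and Mathlib setup, or the LaTeX sources.

```python
import json
import sys
from pathlib import Path


def check_project(root: Path) -> list[str]:
    """Return human-readable problems; an empty list means every check passed.

    Hypothetical helper: the file locations (all at the project root) are
    assumptions, not RepoProver's documented layout.
    """
    problems = []

    # The summary states that Python 3.10+ is required.
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")

    # The project must be a Git repository.
    if not (root / ".git").exists():
        problems.append("project is not a Git repository (run `git init`)")

    # CONTENTS.md documents the structure of the project.
    if not (root / "CONTENTS.md").is_file():
        problems.append("CONTENTS.md is missing")

    # manifest.json lists target theorems and definitions. Only JSON
    # well-formedness is checked, because the schema is not given here.
    manifest = root / "manifest.json"
    if not manifest.is_file():
        problems.append("manifest.json is missing")
    else:
        try:
            json.loads(manifest.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"manifest.json is not valid JSON: {exc}")

    # The issue tracker directory must exist and start out empty.
    issues = root / "issues"
    if not issues.is_dir():
        problems.append("issues/ directory is missing")
    elif any(issues.iterdir()):
        problems.append("issues/ directory is not empty")

    return problems
```

Calling check_project(Path(".")) from the project root returns the list of problems found; printing each entry and stopping when the list is non-empty gives a simple guard to run before starting the coordinator.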

By combining LLMs with structured collaborative workflows and formal verification tools, RepoProver is a step toward automating rigorous formalization and speeding up the development of formally verified knowledge bases.

The Lowdown