MaxProof
MaxProof is an AI framework that combines generative-verifier reinforcement learning with population-level test-time scaling to tackle competition-level mathematical proofs. The system integrates proof generation, verification, and repair, and its reported scores exceed the human gold-medal thresholds on both IMO 2025 and USAMO 2026. Results at that level make it a compelling read for the Hacker News community interested in advanced AI capabilities and automated reasoning.
The Lowdown
MaxProof, part of the MiniMax-M3 series, is an AI system for automated mathematical proof on competition-level problems. It pairs generative-verifier reinforcement learning with a population-level test-time scaling framework to reach its reported results in mathematical reasoning.
- The core of MaxProof involves training three distinct, proof-oriented capabilities: proof generation, robust proof verification, and critique-conditioned proof repair.
- These capabilities are integrated into a single M3 model, engineered with a "defense-in-depth" generative verifier to ensure a very low false-positive rate.
- At test time, MaxProof uses the M3 model in four roles: generator, verifier, refiner, and ranker.
- It then searches over a diverse population of candidate proofs, using tournament selection to pick the final proof it returns.
- With this test-time scaling, MaxProof's M3 model scored 35/42 on IMO 2025 and 36/42 on USAMO 2026, both above the human gold-medal threshold for the respective competition.
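The test-time loop described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not MaxProof's actual implementation: every name here (`Candidate`, `strict_verify`, `tournament`, `search`, the `toy_*` stubs, and the `passes`, `population`, and `rounds` parameters) is hypothetical. It also assumes that "defense-in-depth" can be approximated by requiring several verifier passes to agree, and that "tournament selection" means a single-elimination bracket decided by pairwise ranking. The toy stubs stand in for the four roles that a single model would play.

```python
import itertools
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    proof: str
    critique: str = ""
    verified: bool = False


def strict_verify(verify: Callable[[str, str], Tuple[bool, str]],
                  problem: str, proof: str, passes: int = 3) -> Tuple[bool, str]:
    # Assumption: "defense-in-depth" is modelled as unanimous acceptance over
    # several verifier passes, trading recall for fewer false positives.
    for _ in range(passes):
        ok, critique = verify(problem, proof)
        if not ok:
            return False, critique
    return True, ""


def tournament(problem: str, pool: List[Candidate], prefer, rng) -> Candidate:
    # Single-elimination bracket: one possible reading of "tournament selection".
    pool = list(pool)
    rng.shuffle(pool)
    while len(pool) > 1:
        winners = []
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            winners.append(a if prefer(problem, a.proof, b.proof) else b)
        if len(pool) % 2 == 1:
            winners.append(pool[-1])  # the unpaired candidate gets a bye
        pool = winners
    return pool[0]


def search(problem, generate, verify, repair, prefer,
           population: int = 4, rounds: int = 3, seed: int = 0) -> Candidate:
    rng = random.Random(seed)
    pool = [Candidate(generate(problem)) for _ in range(population)]
    # rounds + 1 verification sweeps, so proofs repaired in the last repair
    # round are still checked before selection.
    for r in range(rounds + 1):
        for c in pool:
            if c.verified:
                continue
            c.verified, c.critique = strict_verify(verify, problem, c.proof)
            if not c.verified and r < rounds:
                # Critique-conditioned repair of a rejected proof.
                c.proof = repair(problem, c.proof, c.critique)
    # Assumption: if nothing verifies, rank the whole pool rather than fail.
    survivors = [c for c in pool if c.verified] or pool
    return tournament(problem, survivors, prefer, rng)


# Toy stand-ins for the four model roles (purely illustrative).
_ids = itertools.count()

def toy_generate(problem: str) -> str:
    return f"draft-{next(_ids)}"

def toy_verify(problem: str, proof: str) -> Tuple[bool, str]:
    ok = proof.endswith("-fixed")
    return ok, "" if ok else "missing final step"

def toy_repair(problem: str, proof: str, critique: str) -> str:
    return proof + "-fixed"

def toy_prefer(problem: str, a: str, b: str) -> bool:
    return a <= b  # arbitrary total order so the bracket has a clear winner


best = search("toy problem", toy_generate, toy_verify, toy_repair, toy_prefer)
print(best.proof, best.verified)
```

In this sketch the caller can inspect `best.verified` to see whether the returned proof actually passed the strict verifier or was only the top-ranked unverified candidate. Requiring unanimous passes makes acceptance harder, which is the intended direction if the goal is a low false-positive rate.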
MaxProof marks an advance in AI's capacity for complex mathematical reasoning: the model not only generates proofs but also critically evaluates and refines them to a human gold-medal standard.