DeepMind AlphaProof Nexus: AI Solves Long-Standing Erdős Math Problems
Generated: 2026-06-06 · API: Gemini 2.5 Flash · Modes: Summary
Clip title: DeepMind’s New AI Found A Strange New Way To Think
Author / channel: Two Minute Papers
URL: https://www.youtube.com/watch?v=Dkqzqw8rxXI
Summary
This video covers DeepMind’s recent result with its AI system, AlphaProof Nexus, on long-standing mathematical problems. The system solved 9 of the 353 Erdős problems it attempted; these are open conjectures, some decades old, left by the legendary Hungarian mathematician Paul Erdős. That is a “failure rate” of about 97.5% on the attempted problems, but the video stresses that solving even a few of them, including some that had stood for 56 years, at a cost of only a couple of hundred dollars per problem, is an “incredibly good” and significant achievement in mathematics.
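As a quick check of the quoted failure rate, the arithmetic can be reproduced from the two counts given in the summary. This is only a sanity check on those two numbers; the variable names are illustrative.

```python
# Counts quoted in the summary: problems attempted and problems solved.
attempted = 353
solved = 9

success_rate = solved / attempted
failure_rate = 1 - success_rate

# Print both rates as percentages with two decimals.
print(f"success rate: {success_rate:.2%}")
print(f"failure rate: {failure_rate:.2%}")
```

The failure rate comes out just under the 97.5% quoted in the video, so that figure is a rounded value.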
The video addresses the common criticism that these AI solutions aren’t “fundamentally new.” It responds with a timeline of AI’s progress in mathematics: four years ago GPT-3 struggled with basic addition; two years ago systems were grappling with high school math competitions; a year ago they won gold at the mathematical Olympiad; and now they are solving 50-year-old open problems. The presenter describes this pace as exponential and ties it to what he calls “The First Law of Papers”: judge a line of research by where it is heading rather than by its current limitations, because AI capabilities are expanding rapidly.
The core innovation behind AlphaProof Nexus is its “tournament system.” A mathematician first formalizes an open problem in Lean, a formal language and proof assistant in which proofs can be machine-checked. A Prover Subagent (an LLM combined with AlphaProof) then attempts to construct a proof. A Proof Validator checks that proof and, if it finds errors, not only rejects the proof but also explains why it is incorrect. Crucially, a cheaper Rater Subagent (another AI) compares the proposed solutions in pairs and assigns them Elo scores, much like a chess rating system. It picks the “better” solution of each pair based on perceived quality, even if both are technically incorrect. Repeating this process lets the system evolve and refine proofs, building a “reliable system out of unreliable parts” until a formally verified solution is reached.
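The pairwise rating step can be illustrated with the standard Elo update used in chess. The sketch below is not DeepMind's implementation: the `prefer` callback is a hypothetical stand-in for the Rater Subagent, and the starting rating of 1000 and step size K = 32 are conventional chess values assumed here, not figures from the video or paper.

```python
import random

K = 32  # Elo step size; a conventional chess value, assumed for illustration


def expected_score(r_a, r_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update(r_winner, r_loser, k=K):
    """Return the new (winner, loser) ratings after one comparison."""
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, r_loser - delta


def rate_candidates(candidates, prefer, rounds=200, seed=0):
    """Run random pairwise comparisons and return a dict of Elo ratings.

    `prefer(a, b)` is a hypothetical stand-in for the Rater Subagent:
    it returns whichever of the two candidates it judges to be better.
    """
    rng = random.Random(seed)
    ratings = {c: 1000.0 for c in candidates}
    for _ in range(rounds):
        a, b = rng.sample(candidates, 2)
        winner = prefer(a, b)
        loser = b if winner == a else a
        ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser])
    return ratings


if __name__ == "__main__":
    # Toy candidates with a hidden quality that the rater only sees noisily.
    quality = {"draft_a": 0.2, "draft_b": 0.5, "draft_c": 0.9}
    noise = random.Random(1)

    def noisy_rater(a, b):
        # Unreliable judge: picks the truly better draft 80% of the time.
        better, worse = (a, b) if quality[a] > quality[b] else (b, a)
        return better if noise.random() < 0.8 else worse

    ratings = rate_candidates(list(quality), noisy_rater)
    for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.0f}")
```

The point of the toy rater is that no single comparison needs to be right: many cheap, noisy judgments still tend to push the stronger candidates toward the top of the ranking.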
While the system’s achievements are lauded, the video also acknowledges limitations. One is selection bias: only the problems that were easier to formalize were tested (roughly 350 of about 1,200 Erdős problems). Another is that smaller AI models solved zero problems, indicating that a “beefy AI at the core” is still necessary. The key takeaway is a shift in the philosophy of AI research: instead of focusing solely on making individual models smarter, the emphasis moves to building robust “algorithmic harnesses” or “multi-agent loops” around existing models. This approach combines multiple agents with iterative refinement to guide even unreliable components toward verifiable solutions to very hard problems.
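The harness idea can be sketched as a generate-and-verify loop: an unreliable generator proposes candidates, and a trusted checker decides which one, if any, is accepted. This is a minimal toy under stated assumptions, not the system described in the paper. The names `harness`, `generate`, and `validate` are hypothetical, and a simple divisibility test stands in for the Lean proof check.

```python
import random


def harness(generate, validate, max_attempts=50):
    """Retry an unreliable generator until a trusted checker accepts.

    generate(feedback) -> candidate    (hypothetical stand-in for a prover)
    validate(candidate) -> (ok, msg)   (hypothetical stand-in for a validator)

    Returns (candidate, attempts_used), or (None, max_attempts) on failure.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        ok, feedback = validate(candidate)
        if ok:
            return candidate, attempt
    return None, max_attempts


if __name__ == "__main__":
    rng = random.Random(0)
    target = 91  # toy task: find a nontrivial divisor of 91

    def guess_divisor(feedback):
        # Toy generator: guesses at random and ignores the feedback.
        # A real prover would condition its next attempt on it.
        return rng.randint(2, 20)

    def check_divisor(d):
        # Trusted checker: cheap to run and never accepts a wrong answer.
        if target % d == 0:
            return True, "ok"
        return False, f"{d} does not divide {target}"

    print(harness(guess_divisor, check_divisor))
```

Anything the loop returns has passed the checker, so the correctness of the result depends only on the validator, not on how often the generator is wrong. The generator's quality affects only how many attempts are needed, which mirrors the observation that a strong core model is still required in practice.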
Video Description & Links
Description
❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers
📝 The paper is available here: https://github.com/google-deepmind/alphaproof-nexus-results https://arxiv.org/html/2605.22763v1
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi
My research: https://cg.tuwien.ac.at/~zsolnai/
Thumbnail design: https://felicia.hu
Tags
ai, deepmind