Google DeepMind has introduced Aletheia, a new agent for solving complex mathematical problems that may be the most powerful AI system of its kind to date. It set a record by scoring 91.9% on IMO-ProofBench Advanced, one of the most rigorous publicly available proof benchmarks, modeled on the International Mathematical Olympiad (IMO) and designed to assess mathematical reasoning.
Aletheia is built on the Gemini Deep Think engine and works through three sequential stages: it generates candidate solutions, verifies them, and then corrects any errors it finds. Notably, even compared with the most advanced version of Gemini Deep Think, the agent demonstrates higher performance while requiring less compute.
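The generate, verify, correct cycle described above can be illustrated with a minimal sketch. This is not DeepMind's implementation: the names `solve`, `Verdict`, `max_revisions`, and the toy stand-in functions are all hypothetical, and a real agent would call a language model where the toy functions do simple arithmetic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of checking a candidate (hypothetical structure)."""
    ok: bool
    feedback: str = ""


def solve(
    problem: int,
    generate: Callable[[int], int],
    verify: Callable[[int, int], Verdict],
    revise: Callable[[int, int, str], int],
    max_revisions: int = 3,
) -> Optional[int]:
    """Generate a candidate, verify it, and revise it on failure.

    Returns the first candidate that passes verification, or None
    if the revision budget runs out.
    """
    candidate = generate(problem)
    for attempt in range(max_revisions + 1):
        verdict = verify(problem, candidate)
        if verdict.ok:
            return candidate
        if attempt < max_revisions:
            candidate = revise(problem, candidate, verdict.feedback)
    return None


# Toy stand-ins (assumptions for illustration only):
# the "problem" is a perfect square, the "solution" is its positive root.
def toy_generate(problem: int) -> int:
    return 5  # a deliberately imperfect first guess


def toy_verify(problem: int, candidate: int) -> Verdict:
    if candidate * candidate == problem:
        return Verdict(ok=True)
    hint = "too small" if candidate * candidate < problem else "too large"
    return Verdict(ok=False, feedback=hint)


def toy_revise(problem: int, candidate: int, feedback: str) -> int:
    return candidate + 1 if feedback == "too small" else candidate - 1


if __name__ == "__main__":
    print(solve(49, toy_generate, toy_verify, toy_revise))
```

In this sketch the verifier's feedback drives each revision, and the loop stops as soon as a candidate passes, so easy problems consume fewer rounds than hard ones.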
Beyond benchmark results, the model has already been applied to several significant tasks. It solved four open problems from Erdős's list, one of which appears to have had no solution in the literature until now; it independently wrote a scientific paper with correct mathematical conclusions; and it worked alongside mathematicians as an assistant, helping to prepare complex scientific publications.
Google also emphasizes that Aletheia confirms an important scaling principle: even in a field as demanding as proof-based mathematics, the quality of the work continues to improve predictably when the agent's architecture is well designed. Moreover, smarter processing loops deliver better results with fewer resources, which is fundamentally important for future research.
Source: deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think/.
