Artificial intelligence models are starting to solve high-level mathematics problems

Over the weekend, Neel Somani, a software engineer, former quantitative researcher, and startup founder, was testing the mathematical skills of a new OpenAI model when he made an unexpected discovery. After pasting a problem into ChatGPT and letting it think for 15 minutes, he came back to a complete solution. He evaluated the proof and formalized it using a tool called Harmonic, and everything checked out.

“I was curious to establish a baseline of when LLMs are effectively able to solve open math problems compared to where they struggle,” Somani said. The surprise was that, with the latest model, the frontier started to move forward a little.

ChatGPT’s chain of thought is even more impressive, invoking mathematical results such as Legendre’s formula, Bertrand’s postulate, and the Star of David theorem. Eventually, the model found a Math Overflow post from 2013, where Harvard mathematician Noam Elkies presented an elegant solution to a similar problem. But ChatGPT’s final proof differed from Elkies’ work in important ways, providing a more complete solution to a version of the problem posed by the legendary mathematician Paul Erdős, whose vast collection of unsolved problems has become a proving ground for artificial intelligence.

For anyone who doubts machine intelligence, it’s a surprising finding, and it’s not the only one. AI tools have become ubiquitous in mathematics, from formalization-oriented LLMs like Harmonic’s Aristotle to literature review tools like OpenAI’s deep research. But since the release of GPT 5.2, which Somani described as “much more adept at mathematical reasoning than previous iterations,” the sheer volume of problems solved has become harder to ignore, raising new questions about the ability of large language models to push the frontiers of human knowledge.

Somani was looking at the Erdős problems, a collection of more than a thousand conjectures by the Hungarian mathematician that are maintained online. The problems have become a tempting target for AI-driven mathematics, and they vary widely in both subject matter and difficulty. The first batch of autonomous solutions came in November from a Gemini-powered model called AlphaEvolve, but more recently, Somani and others have found that GPT 5.2 is remarkably adept at high-level mathematics.

Since Christmas, 15 problems have been moved from “open” to “solved” on the Erdős website, and 11 of those solutions specifically credit AI models as being involved in the process.

The esteemed mathematician Terence Tao offers a more nuanced view of the progress on his GitHub page, counting eight different Erdős problems where AI models have made meaningful, independent progress, with a further six cases where progress was made by locating and building on previous research. There is still a long way to go before AI systems can do mathematics without human intervention, but there is clearly an important role for large models to play.

On Mastodon, Tao predicted that the scalable nature of AI systems makes them “more suitable for systematic application to the ‘long tail’ of arcane Erdős problems, many of which already have clear and straightforward solutions.”

“As such, many of these easier Erdős problems are now more likely to be solved by purely AI-based methods rather than human or hybrid means,” Tao continued.

Another driving force is the recent shift toward formalization, a labor-intensive task that makes verifying and extending mathematical reasoning easier. Formalization does not require the use of artificial intelligence or even computers, but a new set of automated tools has made the process much easier. The open-source proof assistant Lean, developed at Microsoft Research in 2013, has become widely used in the field as a way to formalize proofs, and AI tools like Harmonic’s Aristotle promise to automate much of the formalization work.
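
For readers unfamiliar with what a formalized proof looks like, here is a minimal sketch in Lean 4. It is a toy illustration only, far simpler than any Erdős problem, and it is not taken from the proofs discussed in this article. The theorem name `add_comm_example` is made up for this example; `Nat.add_comm` is a lemma from Lean’s core library.

```lean
-- A concrete arithmetic fact, checked by direct computation.
example : 2 + 2 = 4 := rfl

-- A general statement about all natural numbers, proved by appealing
-- to the core library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If either statement were false, or the justification did not actually establish it, Lean would reject the file. That mechanical check is what makes a formalized proof easier to verify than a written one.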

For Harmonic founder Tudor Achim, the sudden jump in Erdős problems solved is less important than the fact that the world’s greatest mathematicians are starting to take these tools seriously. “I care more about the fact that math and computer science professors are using [AI tools],” Achim said. “These people have a reputation to protect, so when they say they use Aristotle or they use ChatGPT, that’s real evidence.”
