DeepMind claims its AI performs better than International Mathematical Olympiad gold medalists


An AI system developed by Google DeepMind, Google’s leading AI research lab, appears to have surpassed the average gold medalist at solving geometry problems in an international mathematics competition.

The system, called AlphaGeometry2, is an improved version of AlphaGeometry, a system DeepMind released last January. In a newly published study, the DeepMind researchers behind AlphaGeometry2 claim that their AI can solve 84% of all geometry problems from the past 25 years of the International Mathematical Olympiad (IMO), a mathematics competition for high school students.

Why is DeepMind interested in a high school mathematics competition? Well, the lab believes that the key to more capable AI may lie in finding new ways to solve challenging geometry problems, specifically Euclidean geometry problems.

Proving a mathematical theorem, or logically explaining why a theorem (for example, the Pythagorean theorem) is true, requires both reasoning and the ability to choose from a range of possible steps toward a solution. These problem-solving skills could, if DeepMind is right, turn out to be a useful component of future general-purpose AI models.

Indeed, last summer DeepMind demonstrated a system that combined AlphaGeometry2 with AlphaProof, an AI model for formal mathematical reasoning, to solve four of the six problems from the 2024 IMO. Beyond geometry problems, approaches like these could be extended to other areas of mathematics and science, for example to help with complex engineering calculations.

AlphaGeometry2 has several core elements, including a language model from Google’s Gemini family of AI models and a “symbolic engine.” The Gemini model helps the symbolic engine, which uses mathematical rules to infer solutions to problems, arrive at feasible proofs for a given geometry theorem.

A typical geometry problem diagram in an IMO exam. Image credits: Google

Olympiad geometry problems are based on diagrams that need “constructs,” such as points, lines, or circles, to be added before they can be solved. AlphaGeometry2’s Gemini model predicts which constructs might be useful to add to a diagram, and the engine draws on them to make deductions.

Basically, AlphaGeometry2’s Gemini model suggests steps and constructions in a formal mathematical language to the engine, which, following specific rules, checks these steps for logical consistency. A search algorithm lets AlphaGeometry2 run multiple searches for solutions in parallel and store potentially useful findings in a common knowledge base.

AlphaGeometry2 considers a problem “solved” when it arrives at a proof that combines the Gemini model’s suggestions with the symbolic engine’s known principles.
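The propose-and-check loop described above can be illustrated with a toy sketch. This is not DeepMind's code: the facts are plain strings, the rules are hand-written implications, and `toy_propose` is a hypothetical stand-in for the language model that suggests a single auxiliary construction. The parallel search and shared knowledge base are left out. It only shows how a rule-based engine can stall until a helpful construction is added to the diagram.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

Fact = str
Rule = Tuple[FrozenSet[Fact], Fact]  # (premises, conclusion)
Proposer = Callable[[Set[Fact]], Optional[Fact]]


def deduction_closure(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Toy symbolic engine: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(premises: Set[Fact], goal: Fact, rules: List[Rule],
          propose: Proposer, max_constructions: int = 3) -> Optional[List[Fact]]:
    """Alternate rule-based deduction with proposed auxiliary constructions.

    Returns the constructions that were needed, or None if the goal
    was not reached.
    """
    known = set(premises)
    used: List[Fact] = []
    while True:
        known = deduction_closure(known, rules)
        if goal in known:
            return used
        if len(used) >= max_constructions:
            return None
        construction = propose(known)
        if construction is None or construction in known:
            return None  # the proposer has nothing new to offer
        known.add(construction)
        used.append(construction)


# Hypothetical example: the base angles of an isosceles triangle are equal.
PREMISES: Set[Fact] = {"AB = AC"}
GOAL: Fact = "angle ABC = angle ACB"
RULES: List[Rule] = [
    (frozenset({"AB = AC", "M is the midpoint of BC"}),
     "triangle ABM is congruent to triangle ACM"),
    (frozenset({"triangle ABM is congruent to triangle ACM"}),
     "angle ABC = angle ACB"),
]


def toy_propose(known: Set[Fact]) -> Optional[Fact]:
    """Stand-in for the language model: suggest one auxiliary point."""
    candidate = "M is the midpoint of BC"
    return None if candidate in known else candidate


print(solve(PREMISES, GOAL, RULES, toy_propose))  # ['M is the midpoint of BC']
```

Without the proposed midpoint, the rules alone cannot reach the goal, which mirrors why the constructs matter in the system the article describes.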

Because of the complexities of translating proofs into a format AI can understand, there is a shortage of usable geometry training data. So DeepMind created its own synthetic data to train AlphaGeometry2’s language model, generating more than 300 million theorems and proofs of varying complexity.

The DeepMind team selected 45 geometry problems from IMO competitions over the past 25 years (from 2000 to 2024), including linear equations and equations that require moving geometric objects around a plane. They then “translated” these into a larger set of 50 problems. (For technical reasons, some problems had to be split into two.)

According to the paper, AlphaGeometry2 solved 42 of the 50 problems, clearing the average gold medalist score of 40.9.

Granted, there are limitations. A technical quirk prevents AlphaGeometry2 from solving problems with a variable number of points, nonlinear equations, or inequalities. And AlphaGeometry2 is not technically the first AI system to reach gold-medal-level performance in geometry, although it is the first to achieve it with a problem set of this size.

AlphaGeometry2 also did worse on another set of harder IMO problems. For an added challenge, the DeepMind team selected problems, 29 in total, that had been nominated for IMO exams by mathematics experts but that have not yet appeared in a competition. AlphaGeometry2 could solve only 20 of these.
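For a quick sense of scale, the reported figures can be turned into solve rates. The snippet below uses only numbers quoted in this article; the variable names are illustrative.

```python
def solve_rate(solved: int, total: int) -> float:
    """Return the share of problems solved, as a percentage."""
    return 100.0 * solved / total


GOLD_MEDALIST_AVERAGE = 40.9  # average gold medalist score on the 50-problem set

imo_rate = solve_rate(42, 50)        # IMO 2000-2024 problem set
nominated_rate = solve_rate(20, 29)  # nominated but unused problems

print(f"IMO set: {imo_rate:.0f}%")              # IMO set: 84%
print(f"Nominated set: {nominated_rate:.0f}%")  # Nominated set: 69%
print(42 > GOLD_MEDALIST_AVERAGE)               # True
```

The first rate matches the 84% figure cited from the study, while the harder set comes in at roughly 69%.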

Still, the study’s results are likely to fuel the debate over whether AI systems should be built on symbol manipulation, that is, manipulating symbols that represent knowledge using rules, or on the ostensibly more brain-like neural networks.

AlphaGeometry2 takes a hybrid approach: its Gemini model has a neural network architecture, while its symbolic engine is rules-based.

Proponents of neural network techniques argue that intelligent behavior, from speech recognition to image generation, can emerge from nothing more than massive amounts of data and computing. Unlike symbolic systems, which solve tasks by defining sets of symbol-manipulating rules dedicated to particular jobs, such as editing a line in word processor software, neural networks try to solve tasks through statistical approximation and learning from examples.

Neural networks are the cornerstone of powerful AI systems like OpenAI’s o1 “reasoning” model. But supporters of symbolic AI claim they are not the be-all and end-all; symbolic AI may be better positioned to efficiently encode the world’s knowledge, reason its way through complex scenarios, and “explain” how it arrived at an answer, these supporters argue.

“It is striking to see the contrast between continuing, spectacular progress on these kinds of benchmarks and, meanwhile, language models, including more recent ones with ‘reasoning,’ continuing to struggle with some simple commonsense problems,” Vince Conitzer, a Carnegie Mellon University computer science professor specializing in AI, told TechCrunch. “I don’t think it’s all smoke and mirrors, but it illustrates that we still don’t really know what behavior to expect from the next system. These systems are likely to be very impactful, so we urgently need to understand them and the risks they pose much better.”

AlphaGeometry2 perhaps demonstrates that the two approaches, symbol manipulation and neural networks, combined are a promising path forward in the search for generalizable AI. Indeed, according to the DeepMind paper, o1, which also has a neural network architecture, could not solve any of the IMO problems that AlphaGeometry2 was able to answer.

This may not be the case forever. In the paper, the DeepMind team said it found preliminary evidence that AlphaGeometry2’s language model was capable of generating partial solutions to problems without the help of the symbolic engine.

“[The] results support ideas that large language models can be self-sufficient without depending on external tools [like symbolic engines],” the DeepMind team wrote in the paper, “but until [model] speed is improved and hallucinations are completely resolved, the tools will stay essential for math applications.”