Human contestants defeated generative AI models from Google and OpenAI at a prestigious international math competition, even though the programs achieved gold-medal-level results for the first time.
None of the models received a perfect score, unlike five young contestants at the International Mathematical Olympiad (IMO), a prestigious annual competition whose participants must be under the age of 20.
Google announced that an advanced version of its Gemini chatbot solved five of the six math problems set by the IMO, which took place this month in Queensland, Australia.
"We can confirm that Google DeepMind has achieved the long-awaited milestone, earning 35 out of a possible 42 points, a result that merits a gold medal," IMO President Gregor Dolinar said, quoted by AFP.
"Their solutions were remarkable in many ways. The IMO judges found them clear, precise, and most of them easy to follow," he added.
About 10% of human participants won gold medals, and five received a perfect score of 42 points.
OpenAI, the US maker of ChatGPT, said its experimental reasoning model also achieved a gold-medal score of 35 points on the test.
The result "achieved a long-standing major goal in the field of artificial intelligence" at "the world's most prestigious math competition," OpenAI researcher Alexander Wei wrote on social media.
"We evaluated our models on the 2025 IMO problems under the same rules as human participants. For each problem, three former IMO medalists independently graded the model's submitted proof," he added.
Google won a silver medal at last year's IMO in the British city of Bath, solving four of the six problems.
That result required two to three days of computation, far longer than this year, when the Gemini model solved the problems within the competition's 4.5-hour time limit, the company said.
The IMO said that technology companies had "privately tested closed AI models on this year's problems," the same ones faced by 641 competitors from 112 countries.
"It's very exciting to see the progress in the mathematical abilities of AI models," IMO President Dolinar said.
He cautioned, however, that the competition organizers were unable to verify how much computing power the AI models used or whether there was any human intervention. | BGNES