The last time someone claimed a medal in an olympiad like this, turned out they manually translated the problem into Lean and then ran a brute force search algorithm to find a proof. For 60 hours. On a supercomputer.
Meanwhile high schoolers get a piece of paper and 4.5 hours.
Even though chess is now effectively solved against human players, I still remember Kasparov's suspicion that one of Deep Blue's moves had a human touch. It was never proven or disproven, but I trust Kasparov's deep intuition amplified by Kasparov requesting access to Deep Blue’s logs, and IBM refusing to share them in full. For more discussions see [1][2][3].
The last time someone claimed a medal in an olympiad like this, turned out they manually translated the problem into Lean and then ran a brute force search algorithm to find a proof. For 60 hours. On a supercomputer.
Meanwhile high schoolers get a piece of paper and 4.5 hours.
Even though chess is now effectively solved against human players, I still remember Kasparov's suspicion that one of Deep Blue's moves had a human touch. It was never proven or disproven, but I trust Kasparov's deep intuition amplified by Kasparov requesting access to Deep Blue’s logs, and IBM refusing to share them in full. For more discussions see [1][2][3].
[1] https://chess.stackexchange.com/questions/9959/did-deep-blue...
[2] https://nautil.us/why-the-chess-computer-deep-blue-played-li...
[3] https://en.chessbase.com/post/deep-blue-s-cheating-move
kinda wild that an llm cant translate to lean?
Both are true. One spent $400 in compute and the other one spent a lot more.
Exactly. And presumably had a more sophisticated harness around the model, longer reasoning chains, best of N, self judging, etc
OpenAI achieved Gold on an unreleased model. GPT-5. Read the tweets and they explain what they did.
Actually, I did it a year ago but I just don't want to release my model.
Where should I address the billion dollar check?
My buddy did it 5 years ago. You wouldn’t know him, he lives in Canada.
my model goes to a different school
The dog ate mine. And the solution didn't fit in the margin, anyway.
OpenAI explicitly said it’s not GPT-5 but another experimental research model https://x.com/alexwei_/status/1946477756738629827?s=46
Thanks. I parsed that wrong. In either case not the same thing Math Arena used.