Comment by demirbey05

7 months ago

Progress is astounding. A report evaluating LLMs on IMO 2025 was published recently; o3 high didn't even get bronze.

Waiting for Terry Tao's thoughts, but these kinds of things are a good use of AI. We need to make science progress faster rather than disrupt our economy without being ready.

25 comments

demirbey05

davis 7 months ago

Here they are: https://mathstodon.xyz/@tao/114881419368778558

demirbey05 7 months ago

Appreciated
> I will not be commenting on any self-reported AI competition performance results for which the methodology was not disclosed in advance of the competition.

ktallett 7 months ago

[flagged]

dang 7 months ago

Please see https://news.ycombinator.com/item?id=44617609.
You degraded this thread badly by posting so many comments like this.
saagarjha 7 months ago
I did competitive math in high school and I can confidently say that they are anything but "basic". I definitely can't solve them now (as an adult) and it's likely I never will. The same is true for most people, including people who actually pursued math in college (I didn't). I'm not going to be the next guy who unknowingly challenges a Putnam winner to do these but I will just say that it is unlikely that someone who actually understands the difficulty of these problems would say that they are not hard.
For those following along but without math-specific experience: consider whether your average CS professor could solve a top competitive programming question. Not Leetcode hard, Codeforces hard.
- samat 7 months ago
  
  Thanks for speaking sense. I think 99% of people saying IMO problems are not hard would not be able to solve basic district-level competition problems and are just not equipped to judge the problems.
  And the 1% here are those IMO/IOI winners who think everyone is just like them. I grew up with them and to you, my friends, I say: this is the reason why AI would not take over the world (and might not even be that useful for real-world tasks), even if it wins every damn contest out there.
  
Aurornis 7 months ago

> I assume you are aware of the standard of Olympiad problems and that they are not particularly high.
Every time an LLM reaches a new benchmark there’s a scramble to downplay it and move the goalposts for what should be considered impressive.
The International Math Olympiad was used by many people as an example of something that would be too difficult for LLMs. It has been a topic of discussion for some time. The fact that an LLM has achieved this level of performance is very impressive.
You’re downplaying the difficulty of these problems. It’s called international because the best in the entire world are challenged by it.
Davidzheng 7 months ago
sorry but I don't think it's accurate to say "they are just challenging for the age range"
- ktallett 7 months ago
  
  I'm aware you believe they are impossible tasks unless you have specific training; I happen to disagree with that.
  
demirbey05 7 months ago
I mean the speed of progress: a few months ago they released o3, and it scored 16 points on IMO 2025.
- ktallett 7 months ago
  
  In that regard I would agree, but to me that suggests the prior hype was baseless.
zug_zug 7 months ago
I feel like I've noticed you making the same comment in 12 places in this thread -- misrepresenting the difficulty of this tournament, and ultimately it comes across as a bitter ex.
Here's an example, problem 5:
Let a_1, a_2, …, a_n be distinct positive integers and let M=max⁡1≤i<j≤n.
Find the maximum number of pairs (i,j) with 1≤i<j≤n for which (a_i + a_j)(a_j − a_i) = M.
- causal 7 months ago
  
  What does max⁡1≤i<j≤n mean? Wouldn't M always be j?
  
- causal 7 months ago
  
  Where did you get this? Don't see it on the 2025 problem set and now I wanna see if I have the right answer
  
- ktallett 7 months ago
  
  Hence proofs as I've stated.
  