Comment by WD-42
6 days ago
> Gemini 2.5 Pro achieved the highest score with an average of 31% (13 points). While this may seem low, especially considering the $400 spent on generating just 24 answers
What? That’s some serious cash for mostly wrong answers.
The time investment a human has to make to get 31% on the IMO is worth far more than $400.
The human still has to put in that time. How would you know which 31% is correct?