Comment by WD-42

6 days ago

> Gemini 2.5 Pro achieved the highest score with an average of 31% (13 points). While this may seem low, especially considering the $400 spent on generating just 24 answers

What? That’s some serious cash for mostly wrong answers.

The time investment a human has to make to get 31% on the IMO is worth far more than $400.

  • The human still has to put in that time. How would you know which 31% is correct?