Comment by YeGoblynQueenne

5 days ago

>> Why is that less exciting? A machine competing in an unconstrained, natural-language, difficult math contest and coming out on top by any means was breathtaking science fiction a few years ago - now it’s not exciting?

Half the internet is convinced that LLMs are a big data cheating machine and if they're right then, yes, boldly cheating where nobody has cheated before is not that exciting.

I don't get it, how do you "big data cheat" an AI into solving previously unencountered problems? Wouldn't that just be engineering?

  • I don’t know how you’d cheat at it either, but if you could, it would manifest as the model getting gold on the test and then, six months later when it’s released to the public, exhibiting wild hallucinations and basic algebra errors. I don’t know if that’s how it’ll play out this time, but I know how it played out the last ten.

  • It depends on what you mean by "engineering". For example, "engineering" can mean that you train and fine-tune a machine learning system to beat a particular benchmark. That's fun times but not really interesting or informative.

  • > previously unencountered problems

    I haven't read the IMO problems, but knowing how math Olympiad problems work, they're probably not really "unencountered".

    People aren't inventing these problems ex nihilo; there's a rulebook somewhere out there to make life easier for contest organizers.

    People aren't doing these contests for money, they are doing them for honor, so there is little incentive to cheat. With big business LLM vendors it's a different situation entirely.

  • I mean, solutions for the 2025 IMO problems are already available on the internet. How can we be sure these are “unencountered” problems?