Comment by falcor84
6 days ago
I don't get it, how do you "big data cheat" an AI into solving previously unencountered problems? Wouldn't that just be engineering?
I don’t know how you’d cheat at it either, but if you could, it would manifest as the model getting gold on the test and then, six months later when it’s released to the public, exhibiting wild hallucinations and basic algebra errors. I don’t know if that’s how it’ll play out this time, but I know how it played out the last ten.
It depends on what you mean by "engineering". For example, "engineering" can mean that you train and fine-tune a machine learning system to beat a particular benchmark. That's fun times but not really interesting or informative.
> previously unencountered problems
I haven't read the IMO problems, but knowing how math Olympiad problems work, they're probably not really "unencountered".
People aren't inventing these problems ex nihilo; there's a rulebook somewhere out there to make life easier for contest organizers.
People aren't doing these contests for money, they are doing them for honor, so there is little incentive to cheat. With big business LLM vendors it's a different situation entirely.
I mean, solutions for the 2025 IMO problems are already available on the internet. How can we be sure these are “unencountered” problems?
They probably have an archived data set from before then that they trained on.
They would if they're honest, which we just don't know for sure these days.