Comment by menaerus

6 months ago

> If performance of o3 doesn't meet expectations, there'll be plenty of people making excuses for it

I agree, and I can definitely see that happening. But given the incentive and impact of this technology, it is also not impossible that some other company or community will create another FrontierMath-like benchmark to cross-validate the results.

I also don't dispute that OpenAI could have faked these results. Time will tell.