Comment by riku_iki
6 months ago
> If they had, surely they could have gotten themselves an even higher mark than 25%.
There may be a limit to how well LLMs can memorize such complex proofs.
They aren't proofs, they're just numbers. All the questions have numerical answers. That's how they're evaluated.
I think those reasoning models are smart enough not to emit a memorized answer if they can't come up with a CoT proof.
But OAI could have reported any result, since no one was checking; they probably just weren't brave enough to declare math a solved topic.