Comment by stingraycharles

5 days ago

They also said this is not part of GPT-5, and “will be released later”. It’s very, very likely a model specifically fine-tuned for this benchmark, and afterwards they’ll evaluate which real-world problems it’s actually good at (e.g. “use o4-mini-high for coding”).

Humans who excel at IMO questions are also "fine-tuned" on them, in the sense that they practice them for hundreds of hours.

  • Their hardware isn't fine-tuned for it, though; it's the same general intelligence hardware that all other humans use.

    So there's a big difference between taking a general intelligence system and making it do well at math, and creating a specialized system that is only good at math and can't be used to get good in other areas.