Comment by modeless

7 months ago

It seems very likely from the description in the link that formal verification tools for mathematical proofs were used in part of the RL training for this model. On the other hand, OpenAI claims "We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling." Which might suggest that they don't specifically use e.g. Lean in their training process. But it's not explicitly stated. All we can really do is speculate unless they publish more detail.

1 comment

modeless

getnormality 7 months ago

The OpenAI proofs are so brutally, inhumanly spartan that I can't imagine how the AI came up with them, except by RLVR against some crudely translated formal language.