Comment by modeless
20 hours ago
It seems very likely from the description in the link that formal verification tools for mathematical proofs were used in part of the RL training for this model. On the other hand, OpenAI claims "We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling." Which might suggest that they don't specifically use e.g. Lean in their training process. But it's not explicitly stated. All we can really do is speculate unless they publish more detail.
The OpenAI proofs are so brutally, inhumanly spartan that I can't imagine how the AI came up with them, except by RLVR against some crudely translated formal language.