Comment by gyrovagueGeist
8 hours ago
Useful and interesting but likely still dangerous in production without connecting to formal verification tools.
I know o3 is far from state of the art these days but it's great at finding relevant literature and suggesting inequalities to consider but in actual proofs it can produce convincing looking statements that are false if you follow the details, or even just the algebra, carefully. Subtle errors like these might become harder to detect as the models get better.
100% o3 has a strong bias towards "write something that looks like a formal argument that appears to answer the question" over writing something sound.
I gave it a bunch of recent, answered MathOverflow questions - graduate level maths queries. Sometimes it would get demonstrably the wrong answer, but it not be easy to see where it had gone wrong (e.g. some mistake in a morass of algebra). A wrong but convincing argument is the last thing you want!