Comment by energy123

A surprising % of these LLM proofs are coming from amateurs.

One wonders if some professional mathematicians are instead choosing to publish LLM proofs without attribution for career purposes.

It probably stems from the perennial observation:

"This LLM is kinda dumb in the thing I'm an expert in"

>One wonders if some professional mathematicians are instead choosing to publish LLM proofs without attribution for career purposes.

This will just become the norm as these models improve, if it isn't largely already the case.

It's like sports where everyone is trying to use steroids, because the only way to keep up is to use steroids. Except there aren't any AI detectors, and it's not breaking any rules (except perhaps some kind of personal moral code) to use AI.

I think a more realistic answer is that professional mathematicians have tried to get LLMs to solve their problems and the LLMs have not been able to make any progress.

  • I think it's a bit early to tell whether GPT 5.2 has substantially helped research mathematicians, given how recently it was released. The models move so fast that even if all previous models were completely useless, I wouldn't assume this one is. Let's wait a year and see? (It takes time to write papers.)

    • It's helped, but it's not correct that mathematicians are scoring major results by just feeding their problems to GPT 5.2 Pro, so the OP's claim that mathematicians are passing off AI output as their own is silly. Here, I'm talking about serious mathematical work, not people posting unattributed AI slop to the arXiv.

      I assume OP was mostly joking, but we need to be careful about letting AI companies hype up their impressive progress at the expense of mathematics. This needs to be discussed responsibly.

I'm actually not sure what the right attribution method would be. I'd lean towards a single line in the acknowledgements? You can use it at every lemma during brainstorming, for example, but it's unclear whether the right convention is to thank it at every lemma...

Anecdotally, I, as a math postdoc, think that GPT 5.2 is qualitatively much stronger than anything else I've used. Its hallucination rate is low enough that my default assumption about any solution is no longer that it's hiding a mistake somewhere. Compare Gemini 3, whose failure mode when it can't solve something is always to pretend it has a solution by "lying": omitting steps, making up theorems, etc. GPT 5.2 usually fails gracefully, and when it makes a mistake it can more often than not admit it when it's pointed out.