Comment by otabdeveloper4
5 days ago
LLMs can often guess the final answer, but the intermediate proof steps are always total bunk.
When doing math you only ever care about the proof, not the answer itself.
Yep, I remember a friend saying they did a maths course at university where the correct answer was given for each question. That way, if you made some silly arithmetic mistake you could go back and fix it; all the marks were for the steps taken to actually solve the problem.
This would have greatly helped me. I was always at a loss as to which trick to apply to solve an exam problem, even while knowing the mathematics behind it. At some point you just had to add a zero that was actually part of a binomial, which then collapsed the whole formula.
Not in this case: the LLM wrote the entire paper, and anyway the proof was the answer.
Once you have a working proof, no matter how bad, you can work towards making it nicer. It's like refactoring in programming.
If your proof is machine checkable, that's even easier.
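For instance (a minimal, hypothetical sketch): in a proof assistant like Lean, every step is verified by the kernel, so a refactor that breaks the argument fails to compile instead of silently producing nonsense.

```lean
-- A tiny machine-checkable proof in Lean 4.
-- If any rewrite of this proof is wrong, Lean rejects it at check time,
-- which is what makes "refactoring" a proof safe.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

Once it checks, you can freely reorganize or shorten the proof, re-running the checker after each change, much like running tests after a refactor.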
That is also mostly how humans work. Once in a blue moon we may get an "intuition", but most of the time we lean on collective knowledge, biases, and behavior patterns to make decisions, write, and talk.
I haven't had success in getting AIs to output working proofs.
You'd need a completely different post-training and agent stack for that.
What’s funny is that there are total cranks in human form who do the same thing. Lots of unsolicited “proofs” get submitted by “amateur mathematicians” where the content is utter nonsense, but, like a monkey with a typewriter, there’s the possibility that they stumble upon an incredible insight.