Comment by skepticATX

7 months ago

Has anyone independently reviewed these solutions?

My proving skills are extremely rusty so I can’t look at these and validate them. They certainly are not traditional proofs though.

skepticATX

I read through P1, and it seemed to be correct. Though you could condense the central idea of the proof into about three sentences and a few drawings.

It reads like the work of someone who found the correct answer but seemingly had no understanding of what they did and just handed in their draft paper.

Which seems odd: shouldn't an LLM be better at prose?

matt123456789 7 months ago

One would think. I suppose OpenAI threw the majority of their compute budget at producing and verifying solutions. It would certainly be interesting to see whether or not this new model can distill its responses to just those steps necessary to convey its result to a given audience.