Comment by dwroberts
4 hours ago
Would be interesting to know what kind of preparatory work actually went into this - how long did it take to construct an input that produced a real result, and how much input did they get from actual mathematicians to guide refining it.
Why?
It's clearly not yet a tool that can deliver new math at scale. I say this because otherwise, the headline would be that they proved / disproved a hundred conjectures, not one. This is what happened with Mythos. You want to be the AI company that "solved" math, just like Anthropic got the headlines for "solving" (or breaking?) security.
The fact they're announcing a single success story almost certainly means that they've thrown a lot of money at a lot of problems, had experts fine-tuning the prompts and verifying the results, and it came back with a single "hit". But that doesn't make the result less important. We now have a new "solver" for math that can solve at least some hard problems that weren't getting solved before.
Whether that spells the end of math as we know it... I don't think so, but math is a bit weird. It's almost entirely non-commercial: it's practiced chiefly in academia, subsidized from taxes or private endowments, and almost never meant to solve problems of obvious practical importance - so in that sense, it's closer to philosophy than, say, software engineering. No philosopher is seriously worried about LLMs taking philosopher jobs even though a chatbot can write an essay, but mathematicians painted themselves into a different corner, I think.
It says so in the papers: "...which was first mathematically generated in one shot by an internal model at OpenAI, and then expositionally refined through human interactions with Codex."
The prep work doesn't really matter; what they say is that it's a one-shot result, achieved by AI. The blog doesn't claim it was done by a currently public model.