Opening sentences:
> Shock! Shock! I learned yesterday that an open problem I’d been working on for several weeks had just been solved by Claude Opus 4.6— Anthropic’s hybrid reasoning model that had been released three weeks earlier! It seems that I’ll have to revise my opinions about “generative AI” one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving.
I think we're going to have several years of people claiming genAI "didn't really do something novel here," despite experts saying otherwise, because people are scared by the idea that complex problem solving isn't exclusive to humans (regardless of whether these models are approaching general intelligence).
I encourage you to look at what the current models with a bit of harnessing are capable of, e.g. Opus 4.6 and Claude Code. Try to make it solve some mathematics-heavy problem you come up with, if only to get a more accurate picture of what's going on.
Unfortunately, these tools generalize way beyond regurgitating the training set. I would not assume they stay below human capabilities in the next few years.
Why any moral person would continue building these at this point I don't know. I guess in the best case the future will have a small privileged class of humans having total power, without need for human workers or soldiers. Picture a mechanical boot stomping on a human face forever.
If this was a joke, it certainly flew over most people's heads...
Prove it.
I would like to note that it would be trivial to definitively prove or disprove such things if we had a searchable public archive of the training data. Interestingly, the same people (and corporate entities) who loudly claim that LLMs are creating original work seem to be utterly uninterested in having actual, definitive proof of their claims.
This would be awesome. Even titles and shasums could be enough.
Did you read the article? It was an open problem.
Was it? It was an open problem to Knuth - who generally knows how to search literature. However, there is enough literature to search that it wouldn't be a surprise at all to discover it was already solved but he just used slightly different terms and so didn't find it. Or maybe it was solved because this is a specialization of something that looks unrelated and so he wouldn't have realized it when he read it. Or...
Overall I'm going with unsolved, because Knuth is a smart person who I'd expect to not miss the above. I'm also sure he falls for the above now and then, even though the majority of the time he doesn't.
Agreed with all of that, but with the added point that Knuth has done a lot of work in this exact area in The Art of Computer Programming Volume 4. If he considers this conjecture open given his particular knowledge of the field, it likely is (although agreed, it's not guaranteed).