Comment by janalsncm
2 days ago
If I have to give it the algorithm as well as the problem, we're no longer even pretending to be in the AGI world. If it falls down interpreting an algorithm, it is worse than even a Python interpreter.
Towers of Hanoi is a well-known toy problem. The algorithm is definitely in any LLM’s training data. So it doesn’t even need to come up with a new algorithm.
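For reference, the standard recursive solution is only a few lines of Python (a minimal sketch; the peg labels are arbitrary):

    def hanoi(n, source, target, spare):
        """Print the moves that transfer n disks from source to target."""
        if n == 0:
            return
        hanoi(n - 1, source, spare, target)  # move the n-1 smaller disks out of the way
        print(f"move disk {n} from {source} to {target}")
        hanoi(n - 1, spare, target, source)  # stack them back on top of the largest disk

    hanoi(3, "A", "C", "B")  # prints the 2**3 - 1 = 7 moves for 3 disks

This is exactly the kind of symbolic procedure the comment is about: trivial to execute mechanically, so there is no new algorithm for the model to discover.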
There may be some technical reason it's failing, but the more fundamental reason is that an autoregressive statistical token generator isn't suited to solving problems with symbolic solutions.
> I'm just saying ~10 MB of short repetitive text lines might be out of scope as a response the LLM driver is willing to give at all, regardless of how it's derived.
In the example someone else gave, o3 broke down after 95 lines of text. That’s far short of 10 MB.
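For scale, the optimal Towers of Hanoi solution for n disks takes 2**n - 1 moves, so the output grows exponentially with disk count. A back-of-the-envelope estimate (the ~10 bytes per move line is an assumed average, not a measured figure) shows where a ~10 MB transcript would come from:

    # Rough output size of the full move list for n disks.
    # ~10 bytes per line is an assumption for illustration.
    for n in (10, 15, 20):
        moves = 2**n - 1
        print(f"n={n}: {moves:,} moves, ~{moves * 10 / 1e6:.1f} MB at ~10 bytes/line")

Under that assumption, ~10 MB corresponds to roughly 20 disks (about a million moves), while 95 lines is only the output for 6 or 7 disks.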