Comment by SlinkyOnStairs
20 hours ago
One major limitation of the LLM architecture is that even the failure mode varies unpredictably between inputs.
The set of 11-digit numbers with any given failure mode (or even successful output) has no discernible pattern, merely whatever randomness the training process baked into the model.
You can't predict ahead of time when they will fail spectacularly, nor draw a clear boundary around the failure cases. An early major example of this was the "glitch tokens" introduced into most LLMs by training on reddit data.
But there is an "in general"/"average failure rate across all inputs of a given size" answer: LLM performance drops off a cliff once the input exceeds some complexity threshold. (A "┐" shaped curve.) In contrast to humans, where you can ask a child to add two N-digit numbers and the error rate will be approximately linear in N.
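A toy model makes the "approximately linear in N" intuition concrete: if the human makes an independent mistake on each digit with some small probability p (p = 0.01 here is purely illustrative, not a measured figure), the chance of at least one error in an N-digit sum grows roughly linearly, with no cliff.

```python
# Toy model of per-digit human error: with an independent error
# probability p per digit, P(at least one error in N digits) is
#   1 - (1 - p)**N  ≈  p * N   for small p*N,
# i.e. approximately linear in N rather than a "┐"-shaped cliff.
p = 0.01  # assumed per-digit error rate (illustrative only)
for n in (2, 5, 10, 20):
    exact = 1 - (1 - p) ** n
    print(f"N={n:2d}  exact={exact:.4f}  linear approx={p * n:.4f}")
```

The exact and linearized values stay close until p*N gets large, which is the regime where the contrast with the LLM cliff matters.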
Most humans struggle to compute 10-digit arithmetic. They use tools instead. Can LLMs learn to use a calculator? Sorry if that's a stupid question. Maybe brains are not well suited for calculation natively.
Yes. LLMs use calculators to great effect. More often, Python as a calculator.
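A minimal sketch of the harness side of that tool use, under assumed conventions (the `CALC(...)` marker, the `fill_tool_calls` helper, and the prompt format are all hypothetical, not any particular vendor's API): the model emits a marker instead of computing digits itself, and the harness evaluates the expression and splices the exact result back in.

```python
# Hypothetical tool-use harness: the model writes CALC(<expr>) markers;
# the harness computes each one and substitutes the result.
import ast
import operator as op
import re

# Whitelisted arithmetic operators (never eval() raw model output).
OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
       ast.Div: op.truediv, ast.USub: op.neg}

def safe_eval(expr: str):
    """Evaluate a plain arithmetic expression via the AST, nothing else."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("disallowed expression")
    return walk(ast.parse(expr, mode="eval").body)

def fill_tool_calls(model_output: str) -> str:
    # Replace each CALC(...) marker with its computed value.
    return re.sub(r"CALC\(([^)]*)\)",
                  lambda m: str(safe_eval(m.group(1))), model_output)

print(fill_tool_calls("The sum is CALC(98765432109 + 12345678901)."))
# prints: The sum is 111111111010.
```

The point of the pattern is that the model only has to decide *when* to reach for the tool; the 11-digit carry chain that trips it up is delegated to exact arithmetic.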
Also, there exist autistic savants who demonstrate that a human brain can perform rote calculations on large numbers much faster than a human with a calculator can.