Comment by meroes
6 days ago
There's no evidence this model works like that. The "axioms" for counting the number of r's in a word are orders of magnitude simpler than those of classical physics, and yet it took a few years to get that right. It's always been context, not logical derivation.
First, false equivalence. The 'strawberry' problem arose because LLMs don't operate on raw text: the input is split into tokens, each mapped to an embedding vector, so the model never sees individual letters, which makes it hard for it to manipulate the surface form of language directly. That does not prevent it from properly doing math proofs.
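For anyone unfamiliar with the tokenization point, here's a minimal sketch using the tiktoken library (assuming it's installed; the exact token split depends on the encoding, "cl100k_base" is just one example):

```python
import tiktoken

# The model receives integer token IDs, not characters; each ID is
# then mapped to an embedding vector inside the model.
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
print(tokens)

# Show which byte chunk each token covers. No single token exposes
# the individual letters, so counting r's means reasoning across
# token boundaries rather than just reading the input.
for t in tokens:
    print(t, enc.decode_single_token_bytes(t))

print("actual count:", "strawberry".count("r"))
```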
Second, we know almost nothing about these models: how they work, how they were trained, or indeed whether they can do these things at all. But a smart human could (by smart I mean someone who gets good grades at engineering school effortlessly, not Albert Einstein).