Comment by Nevermark
5 months ago
That was always a specious test.
LLMs don't ingest text a character at a time. Their difficulty with analyzing individual letters just reflects that they don't directly "see" letters in their tokenized input.
A direct comparison would be asking someone how many convex Bézier curves are in the spoken word "monopoly".
Or how many red pixels are in a visible icon.
We could work out answers to both. But they wouldn't come to us in one shot, or accurately, without specific practice.
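To make the tokenization point concrete, here is a minimal sketch. The vocabulary, the IDs, and the `toy_tokenize` helper are all made up for illustration, and the greedy longest-match split is a stand-in, not how any particular model's tokenizer actually works (real ones typically use learned subword schemes with far larger vocabularies). It only shows the general shape of the problem: the model receives a short list of IDs, and the letters are not in that list.

```python
# Hypothetical toy vocabulary; the pieces and IDs are invented for this sketch.
TOY_VOCAB = {"mono": 103, "poly": 104}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into vocabulary IDs using greedy longest-match (a simplification)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"no token covers {text[i:]!r}")
    return ids


word = "monopoly"
print(word.count("o"))     # character view: 3
print(toy_tokenize(word))  # token view: [103, 104]
```

Counting the letter "o" is trivial on the string, but nothing in `[103, 104]` says how many times it occurs; that would have to be recalled or inferred from what the model learned about each token.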