Comment by Nevermark

6 months ago

That was always a specious test.

LLMs don't ingest text a character at a time. Their difficulty with analyzing individual letters just reflected that they don't directly "see" letters in their tokenized input.
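To make that concrete, here is a toy sketch of what tokenized input looks like. The vocabulary, the IDs, and the greedy longest-match rule are all made up for illustration; real tokenizers (BPE and friends) use learned vocabularies and different merge rules, but the effect is similar: the model gets a short list of integers, not a string of letters.

```python
# Hypothetical vocabulary: pieces and IDs are invented for this example
# and do not come from any real tokenizer.
VOCAB = {
    "mono": 201, "poly": 202,
    "m": 1, "o": 2, "n": 3, "p": 4, "l": 5, "y": 6,
}

def tokenize(text, vocab=VOCAB):
    """Split text into token IDs using greedy longest-match (a simplification)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

word = "monopoly"
token_ids = tokenize(word)   # what the model is given
letter_o = word.count("o")   # trivial when you can see the characters

print(token_ids)  # [201, 202]
print(letter_o)   # 3
```

Counting the letter is a one-liner over the characters, but nothing in `[201, 202]` says how many times "o" occurs; that has to be learned as a fact about each token.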

A direct comparison would be asking someone how many convex Bézier curves are in the spoken word "monopoly".

Or how many red pixels are in a visible icon.

We could work out answers to both. But without specific practice, they won't come to us in one shot, or accurately.
