Comment by jfim

3 days ago

Counting letters is tricky for LLMs because they operate on tokens, not letters. From the perspective of an LLM, if you ask it "this is a sentence, count the letters in it" it doesn't see a stream of characters like we do, it sees [851, 382, 261, 21872, 11, 3605, 290, 18151, 306, 480].
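To make that concrete, here's a toy subword tokenizer (the vocabulary and IDs are made up for illustration; real tokenizers like BPE work similarly but with tens of thousands of learned subwords). The point is that the output is just a list of integers, and nothing in that list exposes the characters inside each piece:

```python
# Toy greedy longest-match subword tokenizer.
# Vocabulary and IDs are hypothetical, chosen just for this demo.
vocab = {"count": 0, "ing": 1, "the": 2, "letter": 3, "s": 4, " ": 5}

def tokenize(text):
    ids = []
    i = 0
    while i < len(text):
        # Try the longest possible match starting at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                ids.append(vocab[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

print(tokenize("counting the letters"))  # [0, 1, 5, 2, 5, 3, 4]
```

A model trained on these IDs only ever sees sequences like `[0, 1, 5, ...]`; the mapping from ID 3 back to the letters l-e-t-t-e-r is not part of its input.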

So what? It knows the number of letters in each token, and can sum them together.

  • How does it know the letters in the token?

    It doesn't.

    There's literally no mapping anywhere of the letters in a token.

    • There is a mapping. An internal, fully learned mapping that's derived from seeing misspellings and words spelled out letter by letter. Some models make it an explicit part of the training with subword regularization, but many don't.

      It's hard to access that mapping though.

      A typical LLM can semi-reliably spell common words out letter by letter, but it can't immediately say how many of each letter a single word contains.

      But spelling the word out first and THEN counting the letters? That works just fine.

    • If it did frequency analysis, I would consider it to have PhD-level intelligence, not just PhD-level knowledge (like a dictionary).
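The spell-then-count trick mentioned above can be sketched as two explicit steps, which is roughly what a model does when prompted to spell first and only then count (the function name and example words are mine, chosen for illustration):

```python
from collections import Counter

def spell_then_count(word, target):
    # Step 1: "spell out" the word letter by letter --
    # the step a model can do semi-reliably on its own.
    letters = list(word)
    # Step 2: count occurrences of the target letter
    # in the spelled-out sequence, which is now trivial.
    return Counter(letters)[target]

print(spell_then_count("strawberry", "r"))  # 3
```

Asking for the count directly forces the model to jump from token IDs straight to an answer; asking it to spell first materializes the characters as tokens it can then count over.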