Comment by Tepix

1 year ago

LLMs think in tokens, not letters. It's like asking someone who is dyslexic about spelling. Not their strong suit. In practice, it doesn't matter much, does it?

Sometimes it does, sometimes it doesn't.

It's evidence that LLMs aren't the right tool for everything, and that other approaches may work better for some tasks.

  • Language models are best treated like consciousness. Our consciousness does a lot less than people like to attribute to it. It is mostly a function of introspection and making connections, rather than being the part of the brain that does higher-level reasoning, or the part that tells your body how to stay alive (like beating your heart).

    By allowing a language model to do function calling, you are essentially allowing it to do specialized "subconscious" thought. The language model becomes a natural language interface to the capabilities of its "subconsciousness".

    A specific human analogy could be: I tell you to pick up a pen off the table, and then you do it. Most of your mental activity would be subconscious: orienting your arm and hand properly, actually grabbing the pen, and picking it up. The linguistic representation of the action would exist in your conscious mind (pick up the pen), but not much else.

    A language model could very easily call out to a text-processing function to correctly do things like count the number of r's in the word strawberry. That is a job that your conscious mind can dispatch to your subconsciousness (a rough sketch of that kind of hand-off follows below).
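
    To make the analogy concrete, here is a minimal sketch of that hand-off in plain Python. The tool name count_letter and the hard-coded "model output" are hypothetical stand-ins; a real setup would use an actual function-calling API, but the division of labour is the same: the model produces a structured request, and ordinary code does the exact counting.

    ```python
    # Hypothetical sketch: the "conscious" model delegates exact string work
    # to a "subconscious" tool. No real LLM API is involved; the tool call
    # below is hard-coded to stand in for what a model might emit.

    def count_letter(word: str, letter: str) -> int:
        """The 'subconscious' capability: exact, deterministic string processing."""
        return word.lower().count(letter.lower())

    # Pretend the model, asked "How many r's are in strawberry?", emits this
    # structured request instead of guessing from its token representation:
    tool_call = {"name": "count_letter", "arguments": {"word": "strawberry", "letter": "r"}}

    TOOLS = {"count_letter": count_letter}
    result = TOOLS[tool_call["name"]](**tool_call["arguments"])
    print(f"There are {result} r's in strawberry.")  # prints 3
    ```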

This doesn't make any sense. The LLM knows the token-to-letter associations if you ask it. Once it has split the token into letters, it can answer the question anyway.

This tells us that the model is unaware of its own limitations; otherwise it would have figured out that this task requires it to split the word into letters first.

  • Well, splitting into letters doesn't help with Mistral Large 2. I tried this:

    [Question: How many r's in strawberry? First split the word into letters, one letter on each line then answer the question.]

    [Answer: Let's split the word "strawberry" into letters:

    s t r a w b e r r y

    Now, let's count the number of 'r's:

    There are 2 'r's in the word "strawberry".]

    [Question: Did you count the number of r's?]

    [Answer: Yes, I did. Here's the count:

    r r

    So, there are 2 'r's in the word "strawberry".]

    I suspect the issue is that these models have to be trained to accomplish tasks, and this is not a task it makes sense to train a model to perform. You might train it to do this with function calling one day, calling a Python library to count the r's in a string, but manually counting r's just doesn't play to an LLM's strengths.

  • Does it? Does it really know how a word, represented by tokens in some context, splits into letters? It may or may not know that table. I wouldn't trust what it says about word/token correspondence any more than what it says in general. (You can check the token boundaries yourself; see the sketch below.)
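
    For what it's worth, you can inspect the token boundaries yourself with a tokenizer library. A small sketch, assuming the tiktoken package and its cl100k_base encoding (other models use different vocabularies, so the exact split will differ):

    ```python
    # Show which chunks a tokenizer actually hands to the model.
    # Assumes `pip install tiktoken`; output depends on the chosen encoding.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for text in ["strawberry", " strawberry"]:
        token_ids = enc.encode(text)
        pieces = [enc.decode([t]) for t in token_ids]
        print(f"{text!r} -> {pieces}")

    # The model sees these chunks, not individual letters, which is why
    # letter-level questions don't line up with its input representation.
    ```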