
Comment by calibas

13 days ago

> Is/was the same true for ASCII/Smalltalk/binary? They are all another way to translate language into something the computer "understands".

That's converting individual characters into a digital representation: in ASCII, "A" is represented as 01000001. The tokenization process for an LLM is similar, but it's only the first step.
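As a rough illustration of the difference, here's a minimal Python sketch. The tiny dict stands in for a real tokenizer's vocabulary (actual BPE vocabularies have tens of thousands of subword entries); it's an assumption for readability, not a real tokenizer:

```python
# ASCII: each character maps to one fixed number, and that's the whole story.
print(format(ord("A"), "08b"))  # 01000001

# Tokenization (toy sketch): a lookup from subword strings to integer IDs.
# Real LLM vocabularies are ~50k-100k entries; this dict is illustrative only.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
tokens = [vocab[w] for w in "the cat sat on the mat".split()]
print(tokens)  # [0, 1, 2, 3, 0, 4]
```

So far this really is just character/word-to-number mapping, like ASCII. The difference comes in what happens to those IDs next.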

An LLM isn't just mapping a word to a number: it takes the entire sentence, considers the position of each word, and converts it all into vectors in a 1,000+ dimensional space. Machine learning has encoded some "meaning" within these dimensions that goes far beyond something like an ASCII string.
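A minimal numpy sketch of that second step, continuing the toy example above. The dimensions here are tiny for readability (real models use 1,000+ dimensional embeddings), the embedding matrix is random rather than learned, and the sinusoidal positional encoding is one common scheme (from the original Transformer paper), not the only one:

```python
import numpy as np

np.random.seed(0)
vocab_size, d_model, seq_len = 5, 8, 6  # real models: d_model in the thousands

# Learned token embeddings: each token ID maps to a dense vector.
# (Random here; in a real model these weights are learned during training.)
token_embedding = np.random.randn(vocab_size, d_model)

# Sinusoidal positional encodings, so the same word gets a different
# representation depending on where it sits in the sentence.
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
pos_encoding = np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

tokens = [0, 1, 2, 3, 0, 4]  # "the cat sat on the mat" from the sketch above
x = token_embedding[tokens] + pos_encoding  # shape (6, 8): one vector per token
print(x.shape)
```

Note that the two occurrences of "the" (token 0) end up as different vectors because of the positional term, which is exactly what an ASCII-style fixed mapping can't express.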

And the proof is that the method actually works; that's why we have LLMs.