
Comment by svachalek

2 years ago

With the caveat that I'm not an LLM expert here but have read up on some of this...

What's basically going on is that the LLM has "read" vast amounts of text and classified words along all kinds of dimensions. As it goes from word to word it's predicting based on all of those dimensions, rather than the Markov chain's simple probabilities. So it knows "king" has high values of masculinity and authority, for example, and "man" has high masculinity without necessarily the authority. Likewise for queen and woman. This also works for connective words like plus, minus, equals, etc. This leads the LLM to judge correctly that the most plausible word to continue that equation is "queen".
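
To make that concrete, here's a toy sketch of that arithmetic in Python. The two dimensions and their values are invented purely for illustration; real embeddings have hundreds of dimensions with no human-readable labels.

    import numpy as np

    # Hypothetical 2-D "embeddings": [masculinity, authority].
    # Real word vectors have hundreds of opaque dimensions.
    vectors = {
        "king":  np.array([0.9, 0.9]),
        "man":   np.array([0.9, 0.1]),
        "woman": np.array([0.1, 0.1]),
        "queen": np.array([0.1, 0.9]),
    }

    # king - man + woman keeps the authority and swaps the gender
    target = vectors["king"] - vectors["man"] + vectors["woman"]   # [0.1, 0.9]

    def cosine(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    # Nearest word to the target vector (real toolkits also exclude the query words)
    best = max(vectors, key=lambda w: cosine(vectors[w], target))
    print(best)   # queen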

With enough dimensions and enough context (preceding words to include in the calculation) you get results that look like reasoning and intelligence -- and although a lot of people have started arguing that we need to define reasoning and intelligence in a way that excludes this, I'm not so sure of that. It seems quite possible that what goes on in our own heads is not so far from this with a few extra steps.

So you're saying that the word2vec king/queen result came from feeding in the verbatim text "king-man+woman", and that text being continued as "queen"? I assumed that was more the result of doing math on the properties the model generated for the tokens king, man, queen, etc.
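
For what it's worth, the classic word2vec demo is exactly that kind of vector math rather than text completion. With gensim and a pretrained vector set (the Google News model is just one commonly used option, and it's a large download) it looks roughly like this:

    import gensim.downloader as api

    # Pretrained Google News word2vec vectors (any word2vec/GloVe set works)
    model = api.load("word2vec-google-news-300")

    # The famous analogy is literally vector arithmetic plus a nearest-neighbour lookup:
    # vec("king") - vec("man") + vec("woman") lands closest to vec("queen")
    print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    # [('queen', ~0.71)]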

And in that case, why does "Write me a limerick <meeting these criteria>" result in ChatGPT producing limericks, when vanishingly few of the limericks in the source text started that way, and vanishingly few of the commands to write a limerick were immediately followed by one?

  • > I assumed that was more the result of doing math on the properties the model generated for the tokens king, man, queen, etc.

    This isn't wrong: the model does do a bunch of "math" (HUGE matrix and vector multiplications) on the vectors generated from the tokens (there's a rough sketch of that math at the end of this reply), but it's not like the model has any recognizable process that resembles our own reasoning process. You can ask it to explain "its own reasoning", but the fact that the model explains it in terms of human reasoning doesn't mean that's the way it works internally -- as of today LLMs do not have introspection capabilities.

    I suspect most people who haven't thought very deeply about what computation, reasoning, and intelligence are would need some time to come to terms with what generative AI is telling us about them.

    If it helps, do note that the models have been trained on terabytes of training data, and the model has "learnt" a bunch of patterns that it could apply to "king-man+woman=" to arrive at "queen" as the answer. We could even speculate that the word2vec vectors for king and queen would have some dimensions that roughly translate to "class" (monarch) and "gender" (same for man and woman), and it would be relatively straightforward for the model to grab the token that has a signal on those same dimensions. But that's kind of a crude way to look at LLMs, since the billions of parameters aren't there only to make money for Nvidia: the actual processes involve much, much more computation than a human mind could follow step by step.
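
    To give a rough feel for what that "math on the token vectors" looks like mechanically, here is a single self-attention head in plain numpy, with random weights, tiny dimensions, and the causal mask and multi-head split left out. A real model stacks dozens of layers of this with billions of learned parameters:

        import numpy as np

        rng = np.random.default_rng(0)
        seq_len, d_model = 4, 8      # 4 tokens, 8-dim embeddings (absurdly small, for illustration)

        x = rng.normal(size=(seq_len, d_model))        # the token vectors for the prompt
        W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

        Q, K, V = x @ W_q, x @ W_k, x @ W_v            # three big matrix multiplications
        scores = Q @ K.T / np.sqrt(d_model)            # how strongly each token attends to each other token
        weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)   # softmax
        out = weights @ V                              # mix information between tokens

        print(out.shape)   # (4, 8): one updated vector per token, handed to the next layer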

  • From any statistical model you’d also get text that has very little to do with the training material. A prediction may have nothing to do with the past, and it may turn out right or wrong. For example, the weather forecast said it would rain all day, but the sun is up and bright in the sky without a cloud in sight.

    The model knows what a limerick is, from the source material. It knows what your criteria are, from the source material. It can predict what someone would say given that prompt (there's a rough sketch of that prediction loop at the end of this comment).

    Humans also do this. I’m usually one or two words ahead of the person I’m speaking to, sometimes even entire paragraphs ahead if I’m paying full attention. My dreams give me unrealistic situations to explore new ways of dealing with them. When I write code, I have a pretty good idea of what I’m going to write before I write it.

    The main difference between a human and an LLM is that a human has no hard limit. A human will still continue when overwhelmed with data, usually by shedding unimportant data. An LLM will just tell you it’s too much data. Smart humans won’t just shed the data, but will “mark” it mentally as potentially important in the future and come back to it once a deeper understanding is achieved.

    There are other, smaller differences as well, but that is the biggest, most annoying one, so far.
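
    As for the limerick question above, here is a bare-bones sketch of the prediction loop, using GPT-2 via Hugging Face transformers only because it's small. A base model like this follows instructions far less reliably than an instruction-tuned chat model, but the mechanism is the same: the request sits in the context and the model keeps predicting the next token.

        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tok = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2")

        prompt = "Write me a limerick about a cat who codes:\n"
        ids = tok(prompt, return_tensors="pt").input_ids

        # Generation is just repeated next-token prediction: the instruction sits in
        # the context, and the model keeps picking a plausible continuation.
        for _ in range(60):
            with torch.no_grad():
                logits = model(ids).logits[0, -1]      # a score for every possible next token
            next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

        print(tok.decode(ids[0]))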