Comment by Antibabelic
6 hours ago
This is my problem with people claiming that LLMs "understand". What we usually call "meaning" is intricately related to an encyclopedic knowledge of the world around us. How does this kind of knowledge not get into the same kind of loop you've just described? It is ultimately founded on our direct experience of the world, on sense data. It is ultimately embodied knowledge.
Vector spaces and bag-of-words models are not specifically related to LLMs, so I think that's irrelevant to this topic. It's not about "knowledge", just the ability to represent words in such a way that similarities between them take on useful computational characteristics.
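To make that concrete, here's a toy sketch (mine, not from the article) of a bag-of-words representation in Python; once sentences are vectors, "similarity" is just arithmetic:

    # Toy bag-of-words sketch: similarity between sentences becomes vector arithmetic.
    import numpy as np

    vocab = ["cat", "dog", "sat", "mat", "ran"]

    def bow_vector(tokens):
        # Count how often each vocabulary word appears in the token list.
        return np.array([tokens.count(w) for w in vocab], dtype=float)

    a = bow_vector("the cat sat on the mat".split())
    b = bow_vector("the dog sat on the mat".split())

    # Cosine similarity: close to 1 for similar sentences, near 0 for unrelated ones.
    cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    print(cosine)  # ~0.67 here, since the sentences share most of their words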
Well, pretty much all of the LLMs are based on the decoder-only version of the Transformer architecture (in fact it’s the T in GPT).
And in the Transformer architecture you’re working with embeddings, which are exactly what this article is about: the vector representation of words.
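As a minimal sketch of what that looks like (my own illustration in PyTorch, with sizes and token ids picked arbitrarily): the first layer of a decoder-only model is just a lookup table from token ids to vectors, and everything downstream operates on those vectors.

    # Minimal embedding-lookup sketch; not any particular model's actual code.
    import torch
    import torch.nn as nn

    vocab_size, d_model = 50_000, 768        # GPT-2-ish sizes, for illustration only
    tok_emb = nn.Embedding(vocab_size, d_model)

    token_ids = torch.tensor([[464, 3290, 3332]])  # hypothetical ids for "the dog sat"
    x = tok_emb(token_ids)                         # shape: (1, 3, 768)
    print(x.shape)

    # The geometry of these vectors (which ones end up near which after training)
    # is where the "encoded human knowledge" lives.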
I really recommend watching this section of the video. Embeddings do encode plenty of "human knowledge" into the vector values and their relations to each other.
https://youtu.be/wjZofJX0v4M?si=QEaPWcp3jHAgZSEe&t=802
This even opens up a more data-driven approach to linguistics, where embeddings are also heavily used.
s/embodied/embedded/, and this is how LLMs understand.
As others already mentioned, the secret is that arithmetic is done on vectors in a high-dimensional space. The meaning of concepts is in how they relate to each other, and high-dimensional spaces end up being a surprisingly good representation.
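The classic demonstration of "meaning as relations" is analogy arithmetic on pretrained word vectors. A hedged example (it assumes you have the gensim package and an internet connection to download the small GloVe vectors; not from this thread):

    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-50")   # small pretrained GloVe word vectors

    # king - man + woman ≈ ?, answered by a nearest-neighbour search in vector space
    result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
    print(result)   # typically something like [('queen', 0.85...)]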
And what are we if not a bunch of interconnected atoms? Smash a person to paste and you will not find any deep meaning in them, no life, no sublime substance making them different from the dust they were made of. What is special in humans? Aren't we just an especially complex hydrocarbon mass that receives external stimuli and remaps them to a physical output? What makes you think that there is something more inside?
There’s nothing special about having an embodiment. A robot has an embodiment of sorts. An LLM meanwhile is a brain in a vat.
And there’s nothing special about my 21x23-foot lawn. Can you emulate it? To what fidelity? How much should the map correspond to the territory? The same square footage, the same elevations down to the millimeter?
You’re not saying anything that counters the point that was made. You’re just mentioning the stuff that people and animals are made of, with the assumed strawman argument (which nobody made) that there is some non-physical essence at play. There isn’t.
Put a camera and some feet on an LLM and maybe it has an embodiment. As long as it just has digital input, it does not, in the sense being discussed here.
What I am talking about concerns how human language relates to meaning. I'm not sure what this has to do with humans being "special". Saying that humans are "just an especially complex hydrocarbon mass that receives external stimuli and remaps them to a physical output" misses the point that what data we have available to us is qualitatively different from that of today's best natural language generation software.
Meaning, in the best case, is correspondence between a word and a group of other sensory inputs, which an embedding lacks. So when you complain that this lacks meaning, the core of the complaint is that it does not look powerful enough.
Give me a better definition of meaning and I might change my mind on the topic.