Comment by viccis

3 days ago

Every day I see people treat gen AI like a thinking human, and Dijkstra's attitude toward anthropomorphizing computers is vindicated even more.

That said, I think the author's use of "bag of words" here is a mistake. Not only does it have a real meaning in a similar area as LLMs, but I don't think the metaphor explains anything. Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.

One metaphor is to call the model a person; another is to call it a pile of words. The two are close to opposites, and I think that's the whole point.

The person metaphor does nothing to explain its behavior, either.

"Bag of words" has a deep origin in English, the Anglo-Saxon kenning "word-hord", as when Beowulf addresses the Danish sea-scout (line 258)

"He unlocked his word-hoard and delivered this answer."

So "bag of words", a word-treasury, was already a metaphor for what makes a person a clever speaker.

I'll make the following observation:

The contrapositive of "All LLMs are not thinking like humans" is "No humans are thinking like LLMs."

And I do not believe we actually understand human thinking well enough to make that assertion.

Indeed, it is my deep suspicion that we will eventually achieve AGI not by totally abandoning today's LLMs for some other paradigm, but rather by embedding them in a loop with the right persistence mechanisms.

  • Given that LLMs are incapable of synthetic a priori knowledge and humans are capable of it, I would say that, as the tech currently stands, it's reasonable to make both of those statements.

  • The loop, or more precisely the "search", does the novel part of thinking; the brain is just optimizing this process. Evolution could manage with the simplest possible model, copying with occasional errors, and in one run it made every one of us. The moral: if you scale the search, the model can be dumb.

    • Let’s not underestimate the scale of the search that led to us, even though you may be right in principle. In addition to deep time on Earth, we may well be just part of a tiny fraction of a universe-wide and mostly fruitless search.

For me, the problem is in the "chat" mechanic that OpenAI and others use to present the product. It lends itself to strong anthropomorphizing.

If instead of a chat interface we simply had a "complete the phrase" interface, people would understand the tool better for what it is.

  • But people aren't using ChatGPT for completing phrases. They're using it to get their tasks done, or get their questions answered.

    The fact that pretraining of ChatGPT is done with a "completing the phrase" task has no bearing on how people actually end up using it.

    • It's not just the pretraining, it's the entire scaffolding between the user and the LLM itself that contributes to the illusion. How many people would continue assuming that these chatbots were conscious or intelligent if they had to build their own context manager, memory manager, system prompt, personality prompt, and interface?

  • I agree 100%. Most people haven't actually interacted directly with an LLM before. Most people's experience with LLMs is ChatGPT, Claude, Grok, or any of the other tools that automatically handle context, memory, personality, temperature, and are deliberately engineered to have the tool communicate like a human. There is a ton of very deterministic programming that happens between you and the LLM itself to create this experience, and much of the time when people are talking about the ineffable intelligence of chatbots, it's because of the illusion created by this scaffolding.
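
    To make that concrete, here is a rough sketch of the kind of plumbing that sits between the chat box and the model. Everything in it is made up (llm_complete is a stand-in for whatever bare completion API a provider exposes), but the shape is the point: it's ordinary, deterministic programming.

        # Hypothetical glue code around a bare text-completion endpoint.
        SYSTEM_PROMPT = "You are a helpful, friendly assistant."  # the "personality"

        def build_prompt(memory_facts, history, user_message):
            """Flatten memory, prior turns, and the new message into one block of text."""
            lines = [SYSTEM_PROMPT]
            lines += [f"(remembered) {fact}" for fact in memory_facts]  # the "memory"
            for role, text in history:                                  # the "context"
                lines.append(f"{role}: {text}")
            lines.append(f"User: {user_message}")
            lines.append("Assistant:")  # the model simply continues the text from here
            return "\n".join(lines)

        def chat_turn(llm_complete, memory_facts, history, user_message):
            prompt = build_prompt(memory_facts, history, user_message)
            reply = llm_complete(prompt, temperature=0.7, max_tokens=512)
            history.append(("User", user_message))
            history.append(("Assistant", reply))
            return reply

    The "chat" is just this loop re-running a completion over an ever-growing transcript; the conversation is assembled around the model, not by it.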

Yeah, bag of words isn’t helpful at all. I really do think that “superpowered sentence completion” is the best description. Not only is it reasonably accurate, it’s understandable: everyone has seen an autocomplete function. And it’s useful: I don’t know how to “use” a bag of words, but I do know how to use sentence completion. It also helps explain why context matters.

  • Sentence completion doesn't do it justice when I can ask an LLM to refactor my repo and come back half an hour later to see the deed done.

    • That's the thing: when you use an ask/answer mechanism, you are just writing a "novel" in which User: asks and personal coding assistant: answers. But all of that text goes into the autocomplete function, and the "toaster" outputs the most probable continuation according to that function.

      It's useful, it's amazing, but as the original text says, thinking of it as "some intelligence with reasoning" makes us use the wrong mental models for it.

  • I've been recently using a similar description, referring to "AI" (LLMs) as "glorified autocomplete" or "luxury autocomplete".

Bag of words is actually the perfect metaphor. The data structure is a bag. The output is a word. The selection strategy is opaquely undefined.
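
And if you take the metaphor down to the mechanics, the last step of decoding really is a draw from a weighted bag: the model assigns a score to every token in its vocabulary, and one gets sampled from the resulting distribution. A toy sketch with invented scores:

    import math, random

    def sample_from_bag(scores, temperature=1.0):
        """Draw one item from a 'bag' where each item's weight is exp(score / temperature)."""
        weights = {tok: math.exp(s / temperature) for tok, s in scores.items()}
        r = random.uniform(0, sum(weights.values()))
        for tok, w in weights.items():
            r -= w
            if r <= 0:
                return tok
        return tok  # guard against floating-point rounding

    # Pretend these are the model's scores for the token after "The cat sat on the"
    print(sample_from_bag({" mat": 9.1, " couch": 7.4, " roof": 6.8, " moon": 2.0}))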

> Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.

Something about there being significant overlap between the smartest bears and the dumbest humans. Sorry you[0] were fooled by the magic bag.

[0] in the "not you, the layperson in question" sense

  • I think it's still a bit of a tortured metaphor. LLMs operate on tokens, not words. And to describe their behavior as pulling the right word out of a bag is so vague that it applies every bit as much to a Naive Bayes model written in Python in 10 minutes as it does to the greatest state of the art LLM.
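
    For reference, the NLP sense of "bag of words" is roughly that ten-minute model: throw away word order, keep counts, and fit something simple on top. A sketch with invented toy data (scikit-learn):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB

        docs   = ["great movie loved it", "terrible plot awful acting",
                  "loved the acting", "awful movie"]
        labels = ["pos", "neg", "pos", "neg"]

        bag = CountVectorizer()              # bag of words: counts only, order discarded
        clf = MultinomialNB().fit(bag.fit_transform(docs), labels)

        print(clf.predict(bag.transform(["loved the plot"])))  # -> ['pos']

    If a description fits this just as well as it fits a frontier model, it isn't doing much explanatory work.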

  • Yeah. I have a half-cynical/half-serious pet theory that a decent fraction of humanity has a broken theory of mind and thinks everyone has the same thought patterns they do. If it talks like me, it thinks like me.

    Whenever the comment section takes a long hit and goes "but what is thinking, really" I get slightly more cynical about it lol

    • Why not?

      By now, it's pretty clear that LLMs implement abstract thinking - as do humans.

      They don't think exactly like humans do - but they sure copy a lot of human thinking, and end up closer to it than just about anything that's not a human.

Spoken Query Language? Just like SQL, but for unstructured blobs of text as a database and unstructured language as a query? Also known as Slop Query Language or just Slop Machine for its unpredictable results.

  • > Spoken Query Language? Just like SQL, but for unstructured blobs of text as a database and unstructured language as a query?

    I feel that's more a description of a search engine. Doesn't really give an intuition of why LLMs can do the things they do (beyond retrieval), or where/why they'll fail.

    • If you want actionable intuition, try "a human with almost zero self-awareness".

      "Self-awareness" used in a purely mechanical sense here: having actionable information about itself and its own capabilities.

      If you ask an old LLM whether it's able to count the Rs in "strawberry" successfully, it'll say "yes". And then you ask it to do so, and it'll say "2 Rs". It doesn't have the self-awareness to know the practical limits of its knowledge and capabilities. If it did, it would be able to work around the tokenizer and count the Rs successfully.

      That's a major pattern in LLM behavior. They have a lot of capabilities and knowledge, but not nearly enough knowledge of how reliable those capabilities are, or meta-knowledge that tells them where the limits of their knowledge lie. So, unreliable reasoning, hallucinations and more.
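
      You can see where the strawberry failure comes from by looking at what the model actually receives. With OpenAI's tiktoken library, for example (the exact splits vary by tokenizer):

          import tiktoken  # pip install tiktoken

          enc = tiktoken.get_encoding("cl100k_base")
          ids = enc.encode("strawberry")
          print([enc.decode([i]) for i in ids])  # e.g. ['str', 'aw', 'berry']

      The letters never reach the model as letters, and nothing gives it a reliable signal that this is one of its blind spots.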
