
Comment by windexh8er

15 hours ago

> They are helping their users create things that didn't exist before.

That is a derived output. That isn't new as in: novel. It may be unique but it is derived from training data. LLMs legitimately cannot think and thus they cannot create in that way.

I will find this often-repeated argument compelling only when someone can prove to me that the human mind works in a way that isn't 'combining stuff it learned in the past'.

5 years ago a typical argument against AGI was that computers would never be able to think because "real thinking" involved mastery of language which was something clearly beyond what computers would ever be able to do. The implication was that there was some magic sauce that human brains had that couldn't be replicated in silicon (by us). That 'facility with language' argument has clearly fallen apart over the last 3 years and been replaced with what appears to be a different magic sauce comprised of the phrases 'not really thinking' and the whole 'just repeating what it's heard/parrot' argument.

I don't think LLMs think or will reach AGI through scaling, and I'm skeptical we're particularly close to AGI in any form. But I feel like it's a matter of incremental steps. There isn't some magic chasm that needs to be crossed. When we get there I think we will look back and see that 'legitimately thinking' wasn't anything magic. We'll look at AGI and instead of saying "isn't it amazing computers can do this" we'll say "wow, is that all there is to thinking like a human".

  • > 5 years ago a typical argument against AGI was that computers would never be able to think because "real thinking" involved mastery of language which was something clearly beyond what computers would ever be able to do.

    Mastery of words is thinking? By that line of argument, computers have been able to think for decades.

    Humans don't think only in words. Our context, memory, and thoughts are processed in ways we still don't understand.

    There's a lot of great information out there describing this [0][1]. Continuing to believe these tools are thinking, however, is dangerous. I'd wager it has something to do with logic: you can't see the process, and it's non-deterministic, so it feels like thinking. ELIZA tricked people. LLMs are no different.

    [0] https://archive.is/FM4y8
    [0] https://www.theverge.com/ai-artificial-intelligence/827820/l...
    [1] https://www.raspberrypi.org/blog/secondary-school-maths-show...

    • > Mastery of words is thinking?

      That's the crazy thing. Yes, in fact, it turns out that language encodes and embodies reasoning. All you have to do is pile up enough of it in a high-dimensional space, use gradient descent to model its original structure, and add some feedback in the form of RL. At that point, reasoning is just a database problem, which we currently attack with attention.

      No one had the faintest clue. Even now, many people not only don't understand what just happened, but they don't think anything happened at all.

      ELIZA, ROFL. How'd ELIZA do at the IMO last year?
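
      A rough numpy sketch of the scaled dot-product attention referred to above, with toy shapes and random inputs rather than anything from a real model:

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      def scaled_dot_product_attention(Q, K, V):
          """Each query attends over all keys; output is a weighted mix of values."""
          d_k = Q.shape[-1]
          scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
          return weights @ V

      # A toy "sequence" of 4 token vectors in an 8-dimensional space.
      X = rng.normal(size=(4, 8))
      print(scaled_dot_product_attention(X, X, X).shape)   # (4, 8)
      ```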

      3 replies →

  • > I will find this often-repeated argument compelling only when someone can prove to me that the human mind works in a way that isn't 'combining stuff it learned in the past'.

    This is the definition of the word ‘novel’.

That is a pedantic distinction. You can create something that didn't exist by combining two things that did exist, using a method of combining them that already existed. For example, you could use a blender to combine almond butter and sawdust. While this may not be "novel", and it may be derived from existing materials and methods, you may still lay claim to having created something that didn't exist before.

For a more practical example, creating bindings from dynamic-language-A for a library in compiled-language-B is a genuinely useful task, allowing you to create things that didn't exist before. Those things are likely to unlock great happiness and/or productivity, even if they are derived from training data.
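
To make the bindings example concrete, here is a minimal Python sketch using ctypes to wrap one function from a hypothetical compiled C library; the library name, function, and signature are invented for illustration:

```python
# Minimal sketch: exposing a compiled C library to Python via ctypes.
# "libfasthash.so" and "fast_hash" are hypothetical stand-ins; a real
# binding would substitute the actual library and its signatures.
import ctypes

lib = ctypes.CDLL("./libfasthash.so")                 # load the compiled library
lib.fast_hash.argtypes = [ctypes.c_char_p, ctypes.c_size_t]
lib.fast_hash.restype = ctypes.c_uint64

def fast_hash(data: bytes) -> int:
    """Python-friendly wrapper around the C function."""
    return lib.fast_hash(data, len(data))

if __name__ == "__main__":
    print(fast_hash(b"hello"))                        # call through the binding
```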

  • > That is a pedantic distinction. You can create something that didn't exist by combining two things that did exist, in a way of combining things that already existed.

    This is the definition of a derived product. Call it a derivative work if we're being pedantic; regardless, it is not any kind of proof that LLMs "think".

  • Pedantic and not true. The LLM has stochastic processes involved. Randomness. That’s not old information. That’s newly generated stuff.

Yeah, you've lost me here, I'm sorry. In the real world humans work with AI tools to create new things. What you're saying is the equivalent of "when a human writes a book in English, because they use words and letters that already exist and that they already know, they aren't creating anything new".

What does "think" mean?

Why is that kind of thinking required to create novel works?

Randomness can create novelty.

Mistakes can be novel.

There are many ways to create novelty.
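
As a concrete (if toy) illustration of randomness alone producing different outputs from identical inputs, here is a sketch of temperature-based sampling over made-up token scores; no real model or vocabulary is involved:

```python
# Toy illustration: sample the "next token" from a softmax over made-up
# logits, with a temperature knob. Repeated runs on the same input can
# and do produce different choices.
import numpy as np

rng = np.random.default_rng()

def sample_next_token(logits, temperature=1.0):
    """Return one token index; higher temperature means more surprising picks."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.2]            # pretend scores for three candidate tokens
print(sample_next_token(logits))    # usually 0, but not always
```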

Also, I think you might not know how LLMs are trained to code. Pre-training gives them some idea of the syntax, etc., but that only gets you to fancy autocomplete.

Modern LLMs are heavily trained using reinforcement data, which consists of custom tasks the labs pay people to do (or comes from distilling another LLM that has had the process performed on it).
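
A loose sketch of what one of those reinforcement tasks can boil down to for coding: run a candidate program against a test and turn the outcome into a scalar reward. Everything here is illustrative; real pipelines add sandboxing, partial credit, preference models, and much more:

```python
# Illustrative reward signal for a coding task: 1.0 if the candidate code
# passes the supplied test, 0.0 otherwise. Not any lab's actual pipeline.
import os
import subprocess
import sys
import tempfile

def reward_for_candidate(candidate_code: str, test_code: str) -> float:
    """Run candidate code plus its tests in a subprocess and score the result."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code + "\n\n" + test_code + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=10)
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0
    finally:
        os.remove(path)

candidate = "def add(a, b):\n    return a + b"
tests = "assert add(2, 3) == 5"
print(reward_for_candidate(candidate, tests))   # 1.0 when the tests pass
```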

Could you give us an idea of what you’re hoping for that is not possible to derive from training data of the entire internet and many (most?) published books?

  • This is the problem: the entire internet is a really bad set of training data because it's extremely polluted.

    Also, the "derived" argument doesn't really hold: just because you know about two things doesn't mean you'd be able to come up with the third. It's actually very hard most of the time, and it requires doing more than next-token prediction.

    • The emergent phenomenon is that the LLM can separate truth from fiction when you give it a massive amount of data. It can figure the world out, just as we can when we are similarly inundated with bullshit data. The pathways exist in the LLM, but it won't necessarily reveal them to you unless you tune it with RL.

      4 replies →

By that definition, nearly all commercial software development (and nearly all human output in general) is derived output.

  • Wow.

    You’re using ‘derived’ to imply ‘therefore equivalent.’ That’s a category error. A cookbook is derived from food culture. Does an LLM taste food? Can it think about how good that cookie tastes?

    A flight simulator is derived from aerodynamics - yet it doesn’t fly.

    Likewise, text that resembles reasoning isn’t the same thing as a system that has beliefs, intentions, or understanding. Humans do. LLMs don't.

    Also... Ask an LLM what's the difference between a human brain and an LLM. If an LLM could "think" it wouldn't give you the answer it just did.

    • > Ask an LLM what's the difference between a human brain and an LLM. If an LLM could "think" it wouldn't give you the answer it just did.

      I imagine that sounded more profound when you wrote it than it did just now, when I read it. Can you be a little more specific, with regard to what features you would expect to differ between LLM and human responses to such a question?

      Right now, LLM system prompts are strongly geared towards not claiming that they are humans or simulations of humans. If your point is that a hypothetical "thinking" LLM would claim to be a human, that could certainly be arranged with an appropriate system prompt. You wouldn't know whether you were talking to an LLM or a human -- just as you don't now -- but nothing would be proved either way. That's ultimately why the Turing test is a poor metric.

    • You’re arguing against a straw man. No one is claiming LLMs have beliefs, intentions, or understanding. They don’t need them to be economically useful.

      1 reply →