
Comment by mort96

21 hours ago

I'm sorry, but the input to a model is a sequence of tokens and the output is a probability distribution over possible next tokens. It's a very very very fancy next token predictor but that is fundamentally what it is. I'm making the argument that this paradigm might not give rise to a general intelligence no matter how much you scale it.
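
To make that concrete, the entire inference loop is roughly this (a minimal sketch; the model and tokenizer here are placeholders, not any particular library's API):

    import torch
    import torch.nn.functional as F

    def generate(model, tokenizer, prompt, max_new_tokens=50):
        # The model's only job: map a token sequence to a probability
        # distribution over the next token, over and over again.
        ids = tokenizer.encode(prompt)
        for _ in range(max_new_tokens):
            logits = model(torch.tensor([ids]))[0, -1]      # a score for every token in the vocabulary
            probs = F.softmax(logits, dim=-1)               # probability distribution over the next token
            ids.append(torch.multinomial(probs, 1).item())  # sample one token and feed it back in
        return tokenizer.decode(ids)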

> It's a very very very fancy next token predictor

Yes, and unless you are prepared to rebut the argument with evidence of the supernatural, that's all there is, period. That's all we are.

So tired of the thought-terminating "stochastic parrot" argument.

  • Do LLMs even learn? The companies that build them train new models partly on the conversations the older models have had with people, but do the models incorporate knowledge into their neural nets as they go along?

    Can an LLM decide, without prompting or API calls, to text someone, or go read about something, or do anything at all other than wait for the next prompt?

    Do LLMs have any conceptual understanding of anything they output? Do they even have a mechanism for conceptual understanding?

    LLMs are incredibly useful and I'm having a lot of fun working with them, but they are a long way from some kind of general intelligence, at least as far as I understand it.

    • "Do LLMs even learn?"

      They have already learned a lot more than any of us ever will. In addition to that, you have the prompt, and you can teach one things in the prompt: if you give it examples of how it should parse things, it gets better at doing it (sketch at the end of this comment).

      I would say yes, they learn.

      "Can an LLM decide" I would argue that you frame that wrong. If a LLM is the same thing as the pure language part of our brain, than the agent harness and the stuff around it, would be another part of our brain. I find it valid to use the LLM with triggers around it.

      Nonetheless, we can probably also design an architecture that has a loop built in.

      "Do LLMs have any conceptual understanding" Thats what a LLM has in their latent space. Basically to be able to predict the next token in such a compressed space, they 'invent' higher meaning in that space. You can ask a LLM about it actually.

      Yeah, for AGI we are not there yet, and we do not know what it will look like.
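
      To make the in-prompt teaching concrete, here is a sketch of a few-shot prompt for a parsing task (the examples and the task are made up, and the formatting is just one common convention):

        examples = [
            ("Meeting moved to March 3rd, 2024", "2024-03-03"),
            ("invoice due 12 Jan 2025", "2025-01-12"),
        ]
        prompt = "Extract the date from each line as YYYY-MM-DD.\n\n"
        for text, date in examples:
            prompt += f"Input: {text}\nOutput: {date}\n\n"
        prompt += "Input: we shipped it on 7/4/23\nOutput:"
        print(prompt)  # paste into any chat model: no weights change, the examples alone steer the parsing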

    • Yes, to all of your questions. You need to use a recent LLM in an agentic harness. Tell it to take notes, and it will.

      After a bit of further refinement, we'll start to call that process "learning." Eventually the question of who owns the notes, who gets to update them, and how, will become a huge, huge deal.
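
      A toy version of that harness, assuming a hypothetical llm() call that stands in for whatever model you actually use; the only memory is a notes file the harness prepends to every session:

        from pathlib import Path

        NOTES = Path("notes.md")            # persistent memory owned by the harness, not the model

        def llm(prompt: str) -> str:
            ...                             # placeholder: call whatever model or API you actually use

        def chat(user_msg: str) -> str:
            notes = NOTES.read_text() if NOTES.exists() else ""
            reply = llm("Notes from earlier sessions:\n" + notes +
                        "\n\nUser: " + user_msg +
                        "\nIf you learned something worth keeping, end with NOTE: <it>.")
            if reply and "NOTE:" in reply:
                with NOTES.open("a") as f:  # the "learning" step: append to the notes
                    f.write(reply.split("NOTE:", 1)[1].strip() + "\n")
            return reply

      The model's weights never change; everything it "remembers" lives in that file, which is exactly why the question of who owns and updates the notes matters.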

  • I'm not sure why you think you know that the human brain works by predicting the next token.

    It's not supernatural: I believe artificial intelligence is possible because I believe human intelligence is just a clever arrangement of matter performing computation. But I would never be presumptuous enough to claim to know exactly how that mechanism works.

    My opinion is that human intelligence might be essentially a fancy next token predictor, or it might work in some completely different way; I don't know. Your claim is that human intelligence is a next token predictor. It seems like the burden of proof is on you.

    • > Your claim is that human intelligence is a next token predictor.

      Literally it is, at least in many of its forms.

      You accepted CamperBob2’s text as input and then you generated text as output. Unless you are positing that this behavior cannot prove your own general intelligence, it seems plain that “next token generator” is sufficient for AGI. (Whether the current LLM architecture is sufficient is a slightly different question.)
