Comment by ben_w

4 days ago

> Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.

While I'm not willing to rule *out* the idea that they're "thinking" (nor "conscious", etc.), the obvious counter-argument here is all the records we have of humans doing thinking, where the records themselves are not doing the thinking that went into creating them.

And I'm saying this as someone whose cached response to "it's just matrix multiplication, it can't think/be conscious/be intelligent" is that, so far as we can measure all of reality, everything in the universe, including ourselves, can be expressed as matrix multiplication.

Falsification, not verification. What would be measurably different if the null hypothesis were wrong?

I've definitely had AIs think through and produce good answers about specific things that have never been asked before on the internet. I think the stochastic parrot argument is well and truly dead by now.

  • I've also experienced this, to an extent, but on qualitative topics the goodness of an answer - beyond basic requirements like being parseable and then plausible - is difficult to evaluate.

    They can certainly produce good-sounding answers, but as to the goodness of the advice they contain, YMMV.

    • I've certainly gotten useful and verifiable answers. If you're not sure about something, you can always ask the model to justify its answer and then see if the arguments make sense.

  • How do you definitely know that?

    • Also, does it matter?

      The point being made here is about the data LLMs have been trained on. Sure, that contains questions and answers, but obviously not all of it is in that form, just as an encyclopedia contains answers without the questions. So IMO framing this as 'no one asked this before' is irrelevant.

      More interesting: did OP get a sensible answer to a question about data which definitely was not in the training set? (And indeed, how was this 'definitely' established?) Not that an answer of 'yes' would prove 'thinking', as opposed to calling it e.g. advanced autocompletion, but it's a much better starting point.

    • Because I gave it a unique problem I had, and it came up with an answer it definitely didn't see in the training data.

      Specifically, I wanted to know how I could interface two electronic components, one of which is niche, recent, handmade, and has no public documentation, so there's no way the model could have known about it beforehand.
