Comment by ajross

8 hours ago

> it's reductive to just call LLMs "bullshit machines" as if the models are not improving

This is true, but I prefer to think of it as "It's delusional to pretend as if human beings are not bullshit machines too".

Lies are all we have. Our internal monologue is almost 100% fantasy. Even in serious pursuits, that's how it works. We make shit up and lie to ourselves, and then only later apply our hard-earned[1] skill prompts to figure out whether or not we're right about it.

How many times have the nerds here been thinking through a great new idea for a design and how clever it would be before stopping to realize "Oh wait, that won't work because of XXX, which I forgot"? That's a hallucination right there!

[1] Decades of education!

I'm not entirely sure I can agree, although the premise is seductive in certain ways. We do lie to ourselves, but we also have meta-cognition - we can recognise our own processes of thought. Imperfect as it may be, we have feedback loops which we can choose to use, we have heuristics we can apply, we can consciously alter our behaviour in the presence of contextual inputs, and so on.

Being wrong is not the same as a hallucination. It's a natural step on a journey to being more right. This feels a bit like Andreessen proudly stating he avoids reflection - you can act like that, but the human brain doesn't have to. LLMs have no choice in the matter.

The problem, unfortunately, is the scale. It's always scale. Humans make all the kinds of mistakes that we ascribe to LLMs, but LLMs can make them much faster and at much larger scale.

Models have gotten ridiculously better, they really have, but the scale has increased too, and I don't think we're ready to deal with the onslaught.

  • Scale is very different, but I wonder if human trust isn't the real issue. We trust technology too much as a group. We expect perfection, but we also assume perfection. This might be because the machines output confident-sounding answers and humans default to trusting confidence as an indirect measure of accuracy, but I think there is another level where people just blindly trust machines because they are so used to using them for algorithms that trend towards giving correct responses.

    Even before LLMs were in the public discourse, I would have businesses ask about using AI instead of building some algorithm manually, and when I asked if they had considered the failure rate, they would either return blank stares or say that would count as a bug. To them, AI meant an algorithm just as good as one built to handle all edge cases in business logic, but easier and faster to implement.

    We can generally recognize the AIs being off when they deal in our area of expertise, but there is some AI variant of Gell-Mann Amnesia at play that leads us to go back to trusting AI when it gives outputs in areas we are novices in.

"Lies are all we have."

If so, how do we distinguish between code that works and code that doesn't work? Why should we even care?

  • > If so, how do we distinguish between code that works and code that doesn't work?

    Hilariously, not by using our brains, that's for sure. You have to have an external machine. We all understand that "testing" and "code review" are different processes, and that's why.

    • Good point. We choose certain tests to perform. We choose certain test results to pay attention to. We don't just keep chatting about (reviewing) the code. We do something else.

      If lies are all we have, then how is this behavior possible?

So your logic is humans and LLMs are the same because humans are wrong sometimes?

  • Pretty much, yeah. Or rather, the fact that we're both reliably wrong in identifiably similar ways makes "we're more alike than different" an attractive prior to me.

    • “More alike than different” is reasonable I think, as long as we’re talking about how we have some of the same failure modes. Although the way we get there is quite different.

      I’m still not a big fan of comparing humans and LLMs because LLMs lack so much of what actually makes us human. We might bullshit or be wrong because of many reasons that just don’t apply to LLMs.

Humans are different. Humans - at least thoughtful humans - know the difference between knowing something and not knowing something. Humans are capable of saying "I don't know" - not just as a stream of tokens, but really understanding what that means.

  • > Humans - at least thoughtful humans - know the difference between knowing something and not knowing something.

    Your no-true-scotsman clause basically falsifies that statement for me. Fine, LLMs are, at worst I guess, "non-thoughtful humans". But obviously LLMs are right an awful lot (more so than a typical human, even), and even the thoughtful make mistakes.

    So yeah, to my eyes "Humans are NOT different" fits your argument better than your hypothesis.

    (Also, just to be clear: LLMs also say "I don't know", all the time. They're just prompted to phrase it as a criticism of the question instead.)

    • Disagree. If you went to 100 random humans and said, "Tell me about the Siberian marmoset", what fraction would make up completely random nonsense to spew back at you? More than zero, sure, but most of them would say "what are you talking about?" or some variation.
