Comment by dkdbejwi383
1 day ago
How would an LLM “know” when it isn’t sure? Its baseline for truth is competent-sounding text; it has no baseline grounded in observed reality. That’s why LLMs can be “tricked” into things like “Mr Bean is the president of the USA”.
It would "know" the same way it "knows" anything else: The probability of the sequence "I don't know" would be higher than the probability of any other sequence.
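As a rough sketch of that idea (not a claim about how any particular model is built), you can score a couple of candidate continuations and compare which one the model assigns more probability. This assumes GPT-2 via Hugging Face transformers; the prompt and candidate answers are made up for the example.

  # Sketch: compare the model's log-probability for two candidate answers.
  # GPT-2 and the example strings are arbitrary choices for illustration.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")
  model.eval()

  def sequence_logprob(prompt: str, continuation: str) -> float:
      """Total log-probability the model assigns to `continuation` given `prompt`."""
      prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
      full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids
      with torch.no_grad():
          logits = model(full_ids).logits
      # Log-prob of each token, conditioned on everything before it.
      log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
      token_logps = log_probs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
      # Keep only the tokens that belong to the continuation.
      return token_logps[0, prompt_len - 1:].sum().item()

  prompt = "Q: Who is the president of the USA?\nA:"
  print(sequence_logprob(prompt, " I don't know."))
  print(sequence_logprob(prompt, " Mr Bean."))

If the model is well calibrated on the question, the abstaining answer can come out ahead; whether that actually happens is an empirical matter.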
Exactly. It's easy to imagine a component in the net that the model is steered towards when nothing else has a high enough activation.
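A toy version of that steering idea, written as an external check rather than a learned component inside the net (the function name and the 0.2 threshold are mine, purely for illustration): look at the next-token distribution and route to an abstain answer when nothing has high enough probability.

  # Toy heuristic, not an actual component of any model: abstain when the
  # next-token distribution is too flat to commit to an answer.
  import torch

  def answer_or_abstain(model, tokenizer, prompt: str, threshold: float = 0.2) -> str:
      ids = tokenizer(prompt, return_tensors="pt").input_ids
      with torch.no_grad():
          next_token_probs = torch.softmax(model(ids).logits[0, -1], dim=-1)
      # If no single token clears the (arbitrary) confidence threshold, bail out.
      if next_token_probs.max().item() < threshold:
          return "I don't know."
      out = model.generate(ids, max_new_tokens=20)
      return tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True)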
The answer is the same as how the messy bag of chemistry that is the human brain "knows" when it isn't sure:
Badly, and with great difficulty; it can just about be done, but even then only kinda.
We really don’t understand the human brain well enough to have confidence that the mechanisms that cause people to respond with “I don’t know” are at all similar to the mechanisms which cause LLMs to give such responses. And there are quite a few prima facie reasons to think that they wouldn’t be the same.
FWIW, I'm describing failure modes of a human, not mechanisms.
I also think "would" in the comment I'm replying to is closer to "could" than to "does".
The mechanisms don't have to be similar, only analogous, in the sense biologists use for morphology: serving the same function without sharing the same structure.
Humans can just as easily be tricked. Something like 25% of the American electorate believed Obama was the Antichrist.
So saying LLMs have no "baseline for truth" doesn't really mean much one way or the other; they are smarter and more accurate than 99% of humans.