Comment by roadside_picnic

18 hours ago

The real danger of the word "hallucination" is that it implies the model knows what's real and erroneously produced a result that is not. All LLM output is basically interpolation; most people just aren't used to thinking of "words" as something that can be the result of interpolation.

Imagine the real high temperatures for three days were 80F on Monday, 100F on Tuesday, and 60F on Wednesday. If I'm missing Tuesday, a model might interpolate from Monday and Wednesday that it was 70F. That would be very wrong, but it would be pretty silly to say that my basic model was "hallucinating". Rather, we would correctly conclude that the model either doesn't have enough information or lacks the capacity to solve the problem correctly (or both).
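
To make that concrete, here's a minimal sketch of the interpolation (the temperatures are just the made-up values from the example above, nothing else assumed):

    # Observed daily highs (degrees F), with Tuesday missing
    known = {"Monday": 80.0, "Wednesday": 60.0}

    # Linear interpolation: estimate the missing day as the midpoint of its neighbors
    tuesday_estimate = (known["Monday"] + known["Wednesday"]) / 2
    print(tuesday_estimate)  # 70.0, even though the real high was 100F

The model isn't "hallucinating" a value; it's doing exactly what interpolation does with the data it has.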

LLMs "hallucinations" are caused by the same thing: either the model lacks the necessary information, or the model simply can't correctly interpolate all the time (this possibility I suspect is the marketing reason why people stick to 'hallucinate', because it implies its a temporary problem not a fundamental limitation). This is also why tweaking prompts should not be used as an approach to fixing "hallucinations" because one is just jittering the input a bit until the model gets it "right".

That's the exact opposite of what the term "hallucination" is intended to imply. If it knew what was real and produced the wrong result anyway, that would be a lie, not a hallucination.

I've heard the term "confabulation" proposed as potentially more accurate than "hallucination", but it never really caught on.

  • "Confabulation" would never catch on because it's a word that most people don't know and couldn't remember. "Hallucination" is easier to remember, easier to understand, and easier for laypersons to build a mental model of.