Comment by jll29
2 months ago
AI professor here. I know this page is a joke, but in the interest of accuracy, a terminological comment: we don't call it a "hallucination" if a model complies exactly with what a prompt asked for and produces a prediction, exactly as requested.
Rather, "hallucinations" are spurious replacements of factual knowledge with fictional material caused by the use of a statistical process (the pseudo-random number generator used with the "temperature" parameter of neural transformers): token prediction without meaning representation.
[typo fixed]
(I should have thought of this yesterday but have just replaced 'hallucinates' with 'imagines' in the title...though one could object to that too...)
I agree with your first paragraph, but not your second. Models can still hallucinate when the temperature is set to zero (i.e., when we always choose the highest-probability token from the model's output token distribution).
In my mind, hallucination is when some aspect of the model's response should be consistent with reality but is not, and the reality-inconsistent information is not directly attributable to, or deducible from, (mis)information in the pre-training corpus.
While hallucination can be triggered by setting the temperature high, it can also be the result of many possible deficiencies in model pre- and post-training that result in the model outputting bad token probability distributions.
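To make the temperature-zero point concrete, here is a minimal sketch of decoding (illustrative only; the function name, logits, and "wrong" token index are made up, not taken from any real model or API). As temperature goes to zero, sampling reduces to argmax, but if the model's own distribution ranks a factually wrong token highest, greedy decoding returns it deterministically:

```python
import numpy as np

def next_token(logits, temperature=1.0, rng=np.random.default_rng()):
    """Pick the next token id from raw logits (illustrative sketch)."""
    if temperature == 0.0:
        # Greedy / temperature-0 decoding: always take the highest-probability token.
        return int(np.argmax(logits))
    # Temperature scaling, softmax, then multinomial sampling.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

# Hypothetical case: the model's distribution ranks a factually wrong token
# (index 0) highest. No sampling randomness is involved at temperature 0,
# yet the output is still a hallucination.
bad_logits = np.array([2.0, 1.0, 0.5])
print(next_token(bad_logits, temperature=0.0))  # always 0
```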
> In my mind, hallucination is when some aspect of the model's response should be consistent with reality
By "reality", do you mean the training corpus? Because otherwise, this seems like a strange standard. Models don't have access to "reality".
> Models don't have access to "reality"
This is an explanation of why models "hallucinate", not a criticism of the provided definition of hallucination.
1 reply →
I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus. For frontier models, we don't even have access to the enormous training corpus, so we have no way of verifying whether the model is regurgitating misinformation it saw there or inventing something out of whole cloth.
> I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus.
If the LLM is accurately reflecting the training corpus, it wouldn’t be considered a hallucination. The LLM is operating as designed.
Matters of access to the training corpus are a separate issue.
7 replies →
That's because of rounding errors
I agree; it's not just the multinomial sampling that causes hallucinations. If that were the case, setting temperature to 0 and just taking the argmax over the logits would "solve" hallucinations. While round-off error causes some stochasticity, it's unlikely to be the primary cause; rather, it's lossy compression over the layers that causes it.
First compression: you create embeddings that need to differentiate N tokens, and the Johnson-Lindenstrauss lemma gives a dimensionality bound that modern architectures are well above. At face value, the embeddings could encode the tokens and keep them deterministically distinguishable. But words aren't monolithic; they mean many things and get contextualized by other words. So despite being above the JL bound, the model is still forced into a lossy compression.
Next compression: each transformer layer blows the input up into the K, V, and Q projections, then compresses it back down to the inter-layer dimension.
Finally, there is the output layer, which at temperature 0 is deterministic, but it is heavily path-dependent on getting to that token. The space of possible paths is combinatorial, so any non-deterministic behavior elsewhere will inflate the likelihood of non-deterministic output, including things like round-off. Heck, most models are quantized down to 4 or even 2 bits these days, which is wild!
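As a side note on the round-off point: floating-point addition is not associative, so the same logits accumulated in a different order (as can happen across different kernels, batch sizes, or quantization schemes) can differ by one ulp, which is enough to flip an argmax between two nearly tied tokens. A tiny, entirely hypothetical illustration (the values are contrived to force a near-tie):

```python
import numpy as np

# Floating-point addition is not associative:
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6
print(a == b)           # False

# Hypothetical near-tie: a competing token's logit happens to equal `a`.
# Two runs compute the "same" logit in different orders, so temperature-0
# (argmax) decoding picks different tokens and the decode paths diverge.
competitor = 0.6000000000000001
print(np.argmax([a, competitor]))  # 0 (exact tie, first index wins)
print(np.argmax([b, competitor]))  # 1 (b is one ulp smaller)
```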
"Hallucination" has always seemed like a misnomer to me anyway considering LLMs don't know anything. They just impressively get things right enough to be useful assuming you audit the output.
If anything, I think all of their output should be called a hallucination.
We don't know if anything knows anything because we don't know what knowing is.
On the other hand, once you're operating under the model of not knowing if anything knows anything, there's really no point in posting about it here, is there?
This is just something that sounds profound but really isn’t.
Knowing is actually the easiest part to define and explain. Intelligence / understanding is much more difficult to define.
1 reply →
Others have suggested "bullshit". A bullshitter does not care (and may not know) whether what they say is truth or fiction. A bullshitter's goal is just to be listened to and seem convincing.
The awareness of the bullshitter is used to differentiate between 'hard' and 'soft' bullshit. https://eprints.gla.ac.uk/327588/1/327588.pdf
> "Hallucination" has always seemed like a misnomer to me anyway considering LLMs don't know anything. They just impressively get things right enough to be useful assuming you audit the output.
If you pick up a dictionary and review the definition of "hallucination", you'll see something along the lines of "something that you see, hear, feel or smell that does not exist"
https://dictionary.cambridge.org/dictionary/english/hallucin...
Your own personal definition arguably reinforces the very definition of hallucination. Models don't get things right. Why? Because their output contrasts with the content covered by their corpus: they output things that don't exist or weren't referred to in it, and that outright contradict factual content.
> If anything, I think all of their output should be called a hallucination.
No. Only the ones that contrast with reality, namely factual information.
Hence the term hallucination.
Want to second this. Asking the model to create a work of fiction and it complying isn't a pathology. Mozart wasn't "hallucinating" when he created "The Marriage of Figaro".
But many artists were hallucinating when they envisioned some of their pieces. Who's to say Mozart wasn't on a trip when he created The Marriage of Figaro?
Bill Atkinson was hallucinating when he envisioned HyperCard.
https://news.ycombinator.com/item?id=44530767
That would have to be a very very long hallucination because it’s a huge opera that took a long time to write.
We don't know Mozart's state of mind when he composed.
He didn't hallucinate the Marriage of Figaro but he may well have been hallucinating.
Terminology-wise, does this read like a better title instead?:
Show HN: Gemini Pro 3 generates the HN front page 10 years from now
> Terminology-wise, does this read like a better title instead?:
"Generates" does not convey any info about the nature of the process used to create the output. In this context, "extrapolates", "predicts", or "explores" sounds more suitable.
But nitpicking over these words is pointless and amounts to going off on a tangent. The use of the term "hallucination" refers to the specific mechanism used to generate this type of output, just like prompting a model to transcode a document and thus generating an output that doesn't match any established format.
I'd vote for imagines.
Wouldn't confabulate/confabulations be a better term as a substitute for "hallucinating"?
The OP clearly didn't mean "hallucination" as a bug or error in the AI, in the way you're suggesting. Words can have many different meanings!
You can easily say, "Johnny had some wild hallucinations about a future where Elon Musk ruled the world." It just means it was some wild speculative thinking. I read the title in this sense of the word.
Not everything has to be nit-picked or overanalysed. This is an amusing article with an amusing title.
Exactly! This is the precise reason I didn't click through at first: from the title, I thought a page must have been somehow output/hallucinated by error. But luckily I then saw the number of votes, reconsidered, and found a great page.
I'm partial though: loving Haskell myself (as a monad_lover), I'm happy it wasn't forgotten either :)
To me, “imagine” would have been a more fitting term here.
(“Generate”, while correct, sounds too technical, and “confabulate” reads a bit obscure.)
"imagine" gives too much credence of humanity to this action which will continue the cognitive mistake we make of anthropomorphzing llms
In French we call that kind of practice "affabulations". Maybe fraud, deception, or deceit are the closest matching translations in this context.
That is what LLMs are molded to do (of course). But there is also the insistence of informed people on unceasingly using fallacious vocabulary. Sure, a bit of analogy can be didactic, but the current trend is rather to seize every occasion to spread the impression that LLMs work through processes similar to human thought.
A good analogy also communicates the fact that it is a mere analogy. So carrying the metaphor further is only going to accumulate more delusion than comprehension.
Interesting.
There are AI professors out there already!
Wouldn't the right term be 'confabulation'?
Latin: Extraclaudiposteriorifabricátio
Pronunciation: ex-tra-clau-dee-pos-TE-ri-o-ri-fa-bri-KA-tee-o
Meaning: "The act of fabricating something by pulling it from one’s posterior."
German: Poausdenkungsherausziehungsmachwerk
Pronunciation: POH-ows-den-kungs-heh-RAUS-tsee-oongs-MAHKH-verk
Meaning: "A contrived creation pulled out of the butt by thinking it up."
Klingon: puchvo’vangDI’moHchu’ghach
Pronunciation: POOKH-vo vang-DEE-moakh-CHU-ghakh (roll the gh, hit the q hard, and use that throat ch like clearing your bat’leth sinuses)
Meaning: "The perfected act of boldly claiming something pulled out from the butt."
No, still too negatively connoted. "Writes", "predicts", or "caricatures" is closer.
Have we abandoned the term "generate" already?
1 reply →