Comment by ffsm8

2 years ago

This is confusing to read

If I accept your definition of hallucinations in the context of LLMs... then isn't your second paragraph literally just a way to artificially increase the likelihood of them occurring?

You seem to differentiate between a hallucination caused by poisoning the dataset and a hallucination produced from correct data, but can you honestly make such a distinction, considering just how much data goes into these models?

Yes, I can make such a distinction: if what the LLM produces is in the training data, then it's not a "hallucination". Note that this is an entirely separate problem from whether the LLM is "correct". In other words, I'm treating the LLM as a Chronicler, summarizing and reproducing what others have previously written, rather than as a Historian trying to determine the underlying truth of what occurred.