Comment by ffsm8
2 years ago
This is confusing to read
If I accept your definition of hallucinations in the context of LLMs, then isn't your second paragraph literally just a way to artificially increase the likelihood of them occurring?
You seem to differentiate between a hallucination caused by poisoning the dataset and a hallucination caused by correct data, but can you honestly make such a distinction, considering just how much data goes into these models?
Yes, I can make such a distinction: if what the LLM is producing is in the training data, then it's not a "hallucination". Note that this is an entirely separate problem from whether the LLM is "correct". In other words, I'm treating the LLM as a Chronicler, summarizing and reproducing what others have previously written, rather than as a Historian trying to determine the underlying truth of what occurred.