
Comment by coldtea

8 hours ago

What part of "Specifically, we define a formal world where hallucination is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all the computable functions and will therefore inevitably hallucinate if used as general problem solvers." doesn't carry the title, to put the question mildly?

As with all works that use too broad a definition of an LLM, this one proves too much. It defines an "LLM" as a computable function obtained by applying a finite number of steps of a generic algorithm to an initial computable function.

What they really prove is that it's impossible to extrapolate an unconstrained, non-continuous function from a finite subset of its values. Good for them, I guess.
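A minimal sketch of that point (my own toy example, not the paper's construction): any finite sample is consistent with many distinct computable functions, so extrapolation beyond the sample is underdetermined.

    # Two distinct computable functions that agree on every training
    # example; no learner can tell them apart from the sample alone.
    train_inputs = [0, 1, 2, 3, 4]

    def f(x):
        return x * x

    def g(x):
        # Matches f on the training set, diverges everywhere else.
        return x * x if x in train_inputs else x * x + 1

    assert all(f(x) == g(x) for x in train_inputs)
    print(f(10), g(10))  # 100 101 -- indistinguishable from the data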

It's like saying that the no-free-lunch theorems prove that LLMs can't be the best optimizer, when what they actually prove (roughly) is that a universally best optimizer doesn't exist. That is, even people aren't the best optimizers, but we manage somehow, so LLMs can too.

I don’t agree with that definition of “hallucination”, for starters.

  • So substitute another phrase, if you prefer. It doesn't change the logic.

    "Specifically, we define a formal world where bungling is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all the computable functions and will therefore inevitably bungle if used as general problem solvers."

    • Their diagonalization argument applies to any system that uses finite training data. Calling such a system "LLM" is an (unintentional) red herring.
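      A toy Python version of the flavor of that argument (my illustration; the paper diagonalizes over an enumeration of all candidate models, while this shows only the one-predictor case): fix any computable predictor, and a computable ground truth can be defined to disagree with it.

          # For ANY fixed 0/1 predictor, define a computable ground
          # truth that contradicts it on every input.
          def adversarial_truth(predictor):
              return lambda x: 1 - predictor(x)

          def my_model(x):  # stand-in for any trained, computable model
              return x % 2

          truth = adversarial_truth(my_model)
          print([(x, my_model(x), truth(x)) for x in range(4)])
          # the predictor disagrees with this ground truth everywhere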
