Comment by simonw
6 days ago
Why does that seem weak to you?
It seems obviously true to me: code hallucinations are where the LLM outputs code with incorrect details - syntax errors, incorrect class methods, invalid imports, etc.
If you have a strong linter in a loop, those mistakes can be automatically detected and passed back into the LLM to get fixed.
Surely that's a solution to hallucinations?
It won't catch other types of logic error, but I would classify those as bugs, not hallucinations.
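A minimal sketch of what that loop can look like (Python, with ruff standing in for the linter and a placeholder generate_code function standing in for whatever model call you use; both of those names are illustrative, not a specific product's API):

    import os
    import subprocess
    import tempfile

    MAX_ATTEMPTS = 3

    def generate_code(prompt: str) -> str:
        # Placeholder: call whatever LLM you are using and return its code.
        raise NotImplementedError

    def lint(source: str) -> str:
        # Run ruff on the generated source; return its diagnostics, or "" if clean.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(source)
            path = f.name
        try:
            result = subprocess.run(["ruff", "check", path], capture_output=True, text=True)
            return result.stdout if result.returncode != 0 else ""
        finally:
            os.unlink(path)

    def generate_with_feedback(prompt: str) -> str:
        source = generate_code(prompt)
        for _ in range(MAX_ATTEMPTS):
            errors = lint(source)
            if not errors:
                return source  # linter is happy, stop looping
            # Feed the diagnostics straight back to the model and retry.
            source = generate_code(
                f"{prompt}\n\nThe previous attempt failed linting:\n{errors}\nPlease fix these issues."
            )
        return source  # still failing after MAX_ATTEMPTS; surface to the caller

The same shape works with a type checker or a test suite in place of ruff; the point is just that machine-checkable mistakes get detected and fed back automatically.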
> It won't catch other types of logic error, but I would classify those as bugs, not hallucinations.
Let's go a step further: the LLM can produce bug-free code too, if we just call the bugs "glitches".
You are making a purely arbitrary decision about how to classify an LLM's mistakes based on how easy they are to catch, regardless of their severity or cause. Simply sorting the mistakes into a different bucket doesn't make them any less of a problem.
I don’t see why an LLM wouldn’t hallucinate project requirements or semantic interface contracts. The only way you could escape that is with full-blown formal specification and verification.
A good example of where a linter wouldn’t work is when the LLM has you import a package that doesn’t exist.
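For instance (the package name below is made up to illustrate the point), a purely static, style-level linter such as ruff doesn't check whether an imported module actually exists in your environment, so this passes the lint step and only blows up when you try to run it:

    import fastjson_utils  # hypothetical package hallucinated by the model

    def parse(payload: str) -> dict:
        return fastjson_utils.loads(payload)

    # ruff check: no errors reported
    # python -c "import fastjson_utils"  ->  ModuleNotFoundError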