Comment by tptacek
2 days ago
When you say "the LLM can easily hallucinate code that will satisfy the compiler but still fail the actual intent of the user", all you are saying is that the code will have bugs. My code has bugs. So does yours. You don't get to use the fancy word "hallucination" for reasonable-looking, readable code that compiles and lints but has bugs.
I think at this point our respective points have been made, and we can wrap it up here.
> When you say "the LLM can easily hallucinate code that will satisfy the compiler but still fail the actual intent of the user", all you are saying is that the code will have bugs. My code has bugs. So does yours. You don't get to use the fancy word "hallucination" for reasonable-looking, readable code that compiles and lints but has bugs.
There is an obvious and categorical difference between the "bugs" that an LLM produces as part of its generated code, and the "bugs" that I produce as part of the code that I write. You don't get to conflate these two classes of bugs as though they are equivalent, or even comparable. They aren't.
They obviously are.
I get that you think this is the case, but it very much isn't. Take that feedback/signal as you like.
Hallucination is a fancy word?
The parent seems to be, in part, referring to "reward hacking", which tends to be used as a supercategory for what many call slop, hallucination, cheating, and so on.
https://courses.physics.illinois.edu/ece448/sp2025/slides/le...