
Comment by usrbinbash

9 months ago

> But agenty LLM configurations aren't just the LLM; they're also code that structures the LLM interactions. When the LLM behind a coding agent hallucinates a function, the program doesn't compile, the agent notices it, and the LLM iterates.

This describes the simplest and most benign case of code assistants messing up. This isn't the problem.

The problem is when the code does compile, but contains logic errors, security f_ckups, performance regressions, or missing functionality. None of those will be caught by something as obvious as a compiler error.
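
To make that concrete, here is a minimal, hypothetical Go sketch (all names and numbers are invented): it compiles without complaint, yet it ships both a logic error and an injectable query, exactly the kind of thing a compiler error will never surface.

```go
// Hypothetical checkout snippet: compiles cleanly, still broken.
package main

import "fmt"

// Logic error: the intent was "10% off orders of 100 or more", but the
// comparison silently excludes exactly 100. No compiler will flag this.
func discountedTotal(total float64) float64 {
	if total > 100 { // should be >= 100
		return total * 0.9
	}
	return total
}

// Security flaw: user input is concatenated straight into a query string.
// It "works" on the happy path, but is trivially injectable.
func buildUserQuery(username string) string {
	return "SELECT * FROM users WHERE name = '" + username + "'" // should use placeholders
}

func main() {
	fmt.Println(discountedTotal(100))                         // 100, not the intended 90
	fmt.Println(buildUserQuery("bob'; DROP TABLE users;--"))  // injectable query
}
```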

And no, "let the AI write tests" wont catch them either, because that's not a solution, that's just kicking the can dwn the road...because if we cannot trust the AI to write correct code, why would we assume that it can write correct tests for that code?

What will ultimately catch those is the poor sod in the data center who, at 03:00 AM, has to ring the on-call engineer out of bed because the production server went SNAFU.

And when the on-call engineer then has to rely on "AI" to fix the mess, because he didn't actually write the code himself and no longer really knows the codebase (or worse: doesn't understand the libraries and language used at all, because he is completely reliant on the LLM for that), companies, and their customers, will be in real trouble. It will be the IT equivalent of attorneys showing up in court with papers containing case references that were hallucinated by some LLM.