Comment by wizzwizz4
11 hours ago
From the article:
> There's a common rebuttal to this, and I hear it constantly. "Just wait," people say. "In a few months, in a year, the models will be better. They won't hallucinate. They won't fake plots. The problems you're describing are temporary." I've been hearing "just wait" since 2023.
We're not trending towards superintelligence with these AIs. We're trending towards (and, in fact, have already reached) superintelligence with computers in general, but LLM agents are among the least capable known algorithms for the majority of tasks we get them to do. The problem, as it usually is, is that most people don't have access to the fruits of obscure research projects.
Untrained children write better code than the most sophisticated LLMs, without even noticing they're doing anything special.
> Untrained children write better code than the most sophisticated LLMs, without even noticing they're doing anything special.
I’ll take that bet. How much money would you like to put on it? We’ll have a neutral third party pick both the untrained child and the LLM.
Let me know.
The rate of hallucination has gone down drastically since 2023. As LLM coding tools continue to pare that rate down, eventually we’ll hit a point where it is comparable to the rate at which human programmers naturally introduce bugs.
I wonder how much of the decrease in hallucination is because the models are getting better, and how much is because these massively over-funded companies are adding a bunch of one-off shims at breakneck speed. I.e., are they truly improving the cognition, or just monkey-patching the hell out of it?
The recent article about AI companies paying experts in the field to help train the models makes me wonder whether they're also manually fixing a bunch of post-processing errors as they come up.
LLMs are still making fundamentally the same kinds of errors that they made in 2021. If you check my HN comment history, you'll see I predicted these errors, just from skimming the relevant academic papers (which is to say they're obvious: I'm far from the only person saying this). There is no theoretical reason we should expect them to go away, unless the model architectures fundamentally change (and no, GPT -> LLaMA is not a fundamental change), because they're not removable discontinuities: they're indicative of fundamental capability gaps.
I don't care how many terms you add to your Taylor series: your polynomial approximation of a sine wave is never going to be suitable for additive speech synthesis. Likewise, I don't care how good your predictive-text transformer model gets at instrumental NLP subtasks: it will never be a good programmer (except as far as it's a plagiarist). Just look at the Claude Code source code: if anyone's an expert in agentic AI development, it's the Claude people, and yet the codebase is utterly unmaintainable dogshit that shouldn't work and, on further inspection, doesn't work.
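The Taylor-series point is easy to check numerically. Here's a quick Python sketch (my own illustration, not from the comment): a Maclaurin polynomial for sin matches beautifully near the expansion point, but any polynomial is unbounded, so far from the origin the error blows up regardless of how many terms you keep.

```python
import math

def taylor_sin(x, terms):
    # Partial sum of the Maclaurin series:
    # sin(x) = sum over k of (-1)^k * x^(2k+1) / (2k+1)!
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

# Near the expansion point, 10 terms are essentially machine-precision accurate:
print(abs(taylor_sin(1.0, 10) - math.sin(1.0)))   # tiny

# Far from it, a polynomial (which must run off to +/- infinity) cannot
# track a bounded periodic function, no matter the degree:
print(abs(taylor_sin(40.0, 10) - math.sin(40.0)))  # astronomically large
print(abs(taylor_sin(40.0, 30) - math.sin(40.0)))  # still enormous
```

For any fixed degree the error is unbounded in x; raising the degree only pushes the breakdown point outward. That's the analogy being drawn: more capacity improves the fit inside the region it was tuned on without changing what the function class can represent.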
That's not to say that no computer program can write computer programs, but this computer program is well into the realm of diminishing returns.