You did not make a negative critique. You completely dismissed the value of coding agents on the basis that the results are not predictable, which is both obvious and doesn’t matter in practice. Anyone who has given these tools a chance will quickly realise that 1) they are actually quite predictable in doing what you ask them to, and 2) them being non-deterministic does not at all negate their value. This is why people can immediately tell you haven’t used these tools, because your argument as to why they’re useless is so elementary.
People denied that bicycles could possibly balance even as others happily pedaled by. This is the same thing.
people also said that selling jpegs of monkeys for millions of dollars was a pump and dump scam, and would collapse
they were right
JPEGs with no value other than fake scarcity are very different from coding agents that people actively use to ship real code.
It’s possible this is correct.
It’s also possible that people more experienced, knowledgeable and skilled than you can see fundamental flaws in using LLMs for software engineering that you cannot. I am not including myself in that category.
I’m personally honestly undecided. I’ve been coding for over 30 years and know something like 25 languages. I’ve taught programming to postgrad level, and built prototype AI systems that foreshadowed LLMs. I’ve written everything from embedded systems to enterprise, web, mainframes, real time, physics simulation and research software. I would consider myself a 7/10 or 8/10 coder.
A lot of folks I know are better coders. To put my experience into context: one guy in my year at uni wrote one of the world’s most famous crypto systems; another wrote large portions of some of the most successful games of the last few decades. So I’ve grown up surrounded by geniuses, basically, and whilst I’ve been lectured by true greats I’m humble enough to recognise I don’t bleed code like they do. I’m just a dabbler. But it irks me that a lot of folks using AI profess it’s the future but don’t really know anything about coding compared to these folks. Not to be a Luddite - they are the first people to adopt new languages and techniques, but they also are super sceptical about anything that smells remotely like bullshit.
One of the wisest insights in coding is the aphorism “beware the enthusiasm of the recently converted.” And I see that so much with AI. I’ve seen it with compilers, with IDEs, paradigms, and languages.
I’ve been experimenting a lot with AI, and I’ve found it fantastic for comprehending poor code written by others. I’ve also found it great for bouncing ideas. But the code it writes, beyond boilerplate, is hot garbage. It doesn’t properly reason, it can’t design architecture, and it can’t write code that is comprehensible to other programmers. Treating it as a “black box to be manipulated by AI” just leads to dead ends that can’t be escaped, terrible decisions that will take huge amounts of expert coding time to undo, security nightmares, and subtle bugs that are super hard to spot, that AI can’t fix, and that you often can’t fix either because you don’t understand the code well enough.
Testing is insufficient for good code. Humans write code in a way that is designed for general correctness. AI does not, at least not yet.
I do think these problems can be solved. I think we probably need automated reasoning systems, or else vastly improved LLMs that border on automated reasoning much like humans do. Could be a year. Could be a decade. But right now these tools don’t work well. Great for vibe coding, prototyping, analysis, review, bouncing ideas.
But right now these tools don’t work well. Great for vibe coding, prototyping, analysis, review, bouncing ideas.
What are some of the models you've been working with?
People did?
Bicycles don't balance, the human on the bicycle is the one doing the balancing.
Yes, that is the analogy I am making. People argued that bicycles (a tool for humans to use) could not possibly work - even as people were successfully using them.
Bicycles (without a rider) do balance at sufficient speed via a self-steering and correction mechanism of the front axle.
Please tell me which one of the headings is not about increased usage of LLMs and derived tools and is about some improvement in the axes of reliability or any kind of usefulness.
Here is the changelog for OpenBSD 7.8:
https://www.openbsd.org/78.html
There's nothing here that says: we made it easier to use more of it. It's about using it better and fixing underlying problems.
The coding agent heading. Claude Code and tools like it represent a huge improvement in what you can usefully get done with LLMs.
Mistakes and hallucinations matter a whole lot less if a reasoning LLM can try the code, see that it doesn't work and fix the problem.
I know it seems like forever ago, but Claude Code only came out in 2025.
It's very difficult to argue against the point that Claude Code:
1) was a paradigm shift in terms of functionality, despite, to be fair, at best incremental improvements in the underlying models.
2) produces results that are, I estimate, an order of magnitude better in terms of output.
I think it's very fair to distill “AI progress 2025” to: with clever tools and loops, you can get better results without better models (up to a point; better than raw output anyway; scaling to multiple agents has not worked). (…and video/image slop infests everything :p).
Whenever someone tells me that AI is worthless, does nothing, scam/slop etc, I ask them about their own AI usage, and their general knowledge about what's going on.
Invariably they've never used AI, or at most very rarely. (If they used AI beyond that, this would be an admission that it was useful at some level).
Therefore it's reasonable to assume that you are in that boat. Now that might not be true in your case, who knows, but it's definitely true on average.
It's not worthless, it's just not world-changing as is, even in the fields where it's most useful, like programming. If the trajectory changes and we reach AGI then this changes too, but right now it's just a way to:
- fart out demos that you don't plan on maintaining, or want to use as a starting place
- generate first-draft unit tests/documentation
- generate boilerplate without too much functionality
- refactor in a very well covered codebase
It's very useful for all of the above! But it doesn't even replace a junior dev at my company in its current state. It's too agreeable, makes subtle mistakes that it can't permanently correct (GEMINI.md isn't a magic bullet, telling it to not do something does not guarantee that it won't do it again), and you as the developer submitting LLM-generated code for review need to review it closely before even putting it up (unless you feel like offloading this to your team) to the point that it's not that much faster than having written it yourself.
because your "negative critique" is just idiotic and wrong