Comment by boron1006
10 hours ago
> A messy codebase is still cheaper to send ten agents through than to staff a team around. And even if the agents need ten days to reason through an unfamiliar system, that is still faster and cheaper than most development teams operating today.
I’ve been on 2 failed projects that have been entirely AI generated and it’s not that agents slow down and you can just send more agents to work on projects for longer, it’s that they becoming completely unable to make any progress whatsoever, and whatever progress they do make is wrong.
Same here. I have now deleted 43k and counting lines of my codebase. There is no point in putting any AI code into production anymore as it almost always uses none or the wrong abstractions.
When you try to throw more agents at the problem or even more verification layer, you just kill your agility even if they would still be able to work
>I’ve been on 2 failed projects that have been entirely AI generated and it’s not that agents slow down and you can just send more agents to work on projects for longer, it’s that they becoming completely unable to make any progress whatsoever, and whatever progress they do make is wrong.
This rhymes a lot with the Mythical Man Month. There's some corollary Mythical Machine Month thing going on with agent developed code at the moment.
The more I work with AIs (I build AI harnessing tools), the more I see similarities between the common attention failures that humans make. I forgot this one thing and it fucks everything up, or you just told me but I have too much in my mind as context that I forget that piece, or even in the case of Claude last night attesting to me while I am ordering it around that it cannot SSH into another server but I find it SSHing into said server about the 5th time I come back with traceback and it just fixes it!
All of these things human do, and i don't think we can attribute it directly to language itself, its attention and context and we both have the same issues.
Right, but when humans are writing the code, they have learned to focus on putting downward pressure on the complexity of the system to help mitigate this effect. I don't get the sense that agents have gotten there yet.
Big business LLMs even have the opposite incentive, to churn as many tokens as possible.
this is the part of the article that I did not sit well with me either. Code is agent generated, agent can debug it but will alway be human owned.
unless anthropic tomorrow comes in and takes ownership all the code claude generates, that is not changing..
Very much like humans when they drown in technical debt. I think the idea that a messy codebase can be magically fixed is laughable.
What I might believe though is that agents might make rewrites a lot more easy.
“Now we know what we were trying to build - let’s do it properly this time!”
It will make rewrite quicker, not "easier".
When the management recognize a tech debt, often it is too late that nobody understand the full requirement or know how things are supposed to work.
The AI agent will just make the same mistake human would make -- writing some half ass code that almost work but missing all sorts of edge case.
I was involved in a big re-write years ago. The boss finally put the old product on his desk with a sign "[boss's name]'s product owner" - that is when people asked how should this work the most common answer was exactly like the old version. 10 years latter the rewrite is a success, but it cost over a billion dollars. I have long suspected that billion dollars could have been better spend by just fixing technical debt.
Potentially, yes, but as with other software, you need to know AND have (automated) verifications on what it does, exactly.
And of course, make the case that it actually needs a rewrite, instead of maintenance. See also second-system effect.
> Potentially, yes, but as with other software, you need to know AND have (automated) verifications on what it does, exactly.
Yes, but even here one needs some oversight.
My experiments with Codex (on Extra High, even) was that a non-zero percentage of the "tests" involved opening the source code (not running it, opening it) and regexing for a bunch of substrings.
>And of course, make the case that it actually needs a rewrite, instead of maintenance.
"The AI said so ..."
I'm wondering how much value there is in a rewrite once you factor in that no one understands the new implementation as well as the old one.
Not only is it difficult to verify, but also the knowledge your team had of your messy codebase is now mostly gone. I would argue there is value in knowing your codebase and that you can't have the same level of understanding with AI generated code vs yours.
The point of a rewrite is to safely delete most of that arcane knowledge required to operate the old system, by reducing the operational complexity of it.
> “Now we know what we were trying to build - let’s do it properly this time!”
I wonder if AI will avoid the inevitable pitfalls their human predecessors make in thinking "if I could just rewrite from scratch I'd make a much better version" (only to make a new set of poorly understood trade offs until the real world highlights them aggressively)
That's correct, the more I work with AI the more it's obvious that all the good practice for humans is also beneficial for AI.
More modular code, strong typing, good documentation... Humans are bad at keeping too much in the short-term memory, and AI is even worse with their limited context window.
Is there a case for having more encapsulation? So a class and tests are defined and the LLM only works on that.
Agents run fast. Not always in the right direction. They benefit from a steady hand.