
Comment by philipp-gayret

8 days ago

The atrophy of manually writing code is certainly real. I'd compare it to using a paper map and a compass to navigate, versus, say, Google Maps. I don't particularly care to lose the skill, even though being good at (and enjoying) the programming part of making software was my main source of income for more than a decade. I just can't escape being significantly faster with Claude Code.

> he can tell what it generates is messy for long-term maintenance, even if it does work and even though he's new to React.

When code can be generated that quickly, it follows that it is not hard to maintain: you could simply regenerate it if you didn't like it. I don't buy this style of argument where it's easy to generate code with AI but then impossible to maintain afterwards. It doesn't hold up logically, and I have yet to see a codebase that AI was able to generate but now cannot maintain. What I have seen this year is feature-complete language and framework rewrites done by AI with these new tools. For me, the unmaintainable-code claim is difficult to believe.

Have you tried using AI-generated code in a non-hobby project? One that has to go to production?

It just hallucinates packages, adds random functions that already exist, and creates new random APIs.

How is that not unmaintainable?

  • We use it daily in our org. What you're talking about is not happening. That said, we have a fairly decent monorepo structure and a bunch of guides/skills to ensure it doesn't do that often, plus the whole plan + implement phases.

    If it were July 2025, I would have agreed with you. But not anymore.

  • I used to experience those issues a lot. I haven't in a while. Between having good documentation in my projects, well-defined skills for normal things, simple to use testing tools, and giving it clear requirements things go pretty smoothly.

    I'd say it still really depends on what you're doing. Are you working in a poorly documented language that few people use solving problems few people have solved? Are you adding yet another normal-ish kind of feature in a super common language and libraries? One will have a lot more pain than the other, especially if you're not supplying your own docs and testing tools.

    There's also just a difference in what to include in the context. I had three different projects which were tightly coupled. AI agents had a hard time keeping things straight as APIs changed between them, constantly misnaming them, getting parameters wrong, and so on. Combining them and having one agent work all three repos with a shared set of documentation stopped the mistakes when a change needed to span multiple projects.

  • Yes, all the time. Yes, those go to production. AI has improved significantly over the past two years; I highly recommend you give it another try.

    I don't see the behaviour you describe; maybe your impression comes from online articles, or you're using a local llama model or ChatGPT from two years ago. In fact, Claude regularly finds and resolves duplicated code. Let me give you a counter-example: for adding dependencies we run an internal whitelist for AI agents; new dependencies go through this system, because we had similar concerns. In the half year or so that we've run the service, I have never seen any agent used in our organisation or at a client hallucinate a dependency.

    • So where does your responsibility for this code end? Do you just push to the repo, merge, and that's it, or do you also deploy, monitor, and maintain the production systems? Who handles outages on Saturday night: you, or someone else?

  • FWIW I mainly use Opus 4.6 on the $100/mo Max plan, and rarely run into these issues. They certainly occur with lower-tier models, with increasing frequency the cheaper the model is. As someone who uses it for a significant portion of my professional and personal work, I don't really understand why this continues to be a widespread issue. Thoroughly vetting Plan Mode output also seems like an easy fix, and most devs should be doing that anyway IMO (e.g. catching an `npm install random-auth-package` before it runs).
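The dependency whitelist mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the commenter's actual system: the idea is simply that any package an agent asks to install is checked against an approved set, and anything unknown is held for human review.

```python
# Hypothetical sketch of a dependency allowlist gate for AI agents.
# The package names here are illustrative, not a real approved list.
ALLOWED_PACKAGES = {"react", "lodash", "express"}

def check_install(requested):
    """Return the requested packages that are NOT on the allowlist."""
    return sorted(set(requested) - ALLOWED_PACKAGES)

blocked = check_install(["react", "random-auth-package"])
if blocked:
    print("Blocked pending review:", ", ".join(blocked))
```

In practice such a check would wrap the package manager itself (npm, pip, etc.), so a hallucinated dependency fails loudly at install time instead of landing in a lockfile.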

LLMs rarely if ever proactively identify cleanup refactors that reduce the complexity of a codebase. They do, however, still happily duplicate logic or large blocks of markup, defer imports rather than fixing dependency cycles, introduce new abstractions for minimal logic, and freely accumulate a plethora of little papercuts and speed bumps.
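The "defer imports rather than fixing dependency cycles" papercut is worth making concrete. Below is a generic, self-contained illustration (the `pricing` module is faked in-process so the snippet runs on its own); it is not from any particular codebase:

```python
import sys
import types

# Simulate a `pricing` module so the example is self-contained.
# In real code, the cycle would be between two actual files.
pricing = types.ModuleType("pricing")
pricing.apply_discount = lambda total: round(total * 0.9, 2)
sys.modules["pricing"] = pricing

# What generated code often looks like: the import is pushed inside the
# function to dodge a circular-import error, hiding the cycle instead of
# fixing the module layering.
def total_price(total):
    from pricing import apply_discount  # deferred import: the papercut
    return apply_discount(total)

print(total_price(100.0))  # 90.0
```

The cleanup an LLM rarely proposes on its own is structural: move the shared logic into a third module that both sides import at the top level, so the cycle disappears and the deferred import becomes unnecessary.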

These same LLMs will then get lost in the intricacies of the maze they created on subsequent tasks, until they are unable to make forward progress without introducing regressions.

You can at this point ask the LLM to rewrite the rat’s nest, and it will likely produce new code that is slightly less horrible but introduces its own crop of new bugs.

All of this is avoidable, if you take the wheel and steer the thing a little. But all the evidence I’ve seen is that it’s not ready for full automation, unless your user base has a high tolerance for bugs.

I understand Anthropic builds Claude Code without looking at the code. And I encounter new bugs, some of them quite obvious and bad, every single day. A Claude process starts at 200MB of RAM and grows from there, for a CLI tool that is just a bundle of file tools glued to a wrapper around an API!

I think they have a rat's nest over there, but they're the only game in town, so I have to live with this nonsense.