Comment by svara

16 hours ago

I'm a fan of AI coding tools but the trend of adding ever more autonomy to agents confuses me.

The rate at which a person running these tools can properly review and comprehend the output is already saturated by a single thread with a human in the loop.

Which implies that this is not intended to be used in a setting where people will be reading the code.

Does that... actually work for anyone? My experience so far with AI tools would have me believe it's a terrible idea.

Yes, this actually works. In 2026, software engineering is going to change a great deal as a result, and if you're not at least experimenting with this stuff to learn what it's capable of, that's a red flag for your career prospects.

I don't mean this in a disparaging way. But we're at a car-meets-horse-and-buggy moment and it's happening really quickly. We all need to at least try driving a car and maybe park the horse in the stable for a few hours.

  • The FOMO nonsense is really uncalled for. If everything is going to be vibecoded in the future, then either there are going to be a million code-unfucking jobs or no jobs at all.

    Attitudes like that, where you believe the righteous AI pushers will be saved from the coming rapture while everyone else is out on the streets, really make people hate the AI crowd.

    • The comment you’re replying to is actually very sensible and non-hypey. I wouldn’t even categorize it as particularly pro-AI, considering how ridiculous some of the frothing pro-AI stuff can get.

It works for me, in that I don't care about all the intermediate babble the AI generates. What matters is the final changelist before hitting commit: going through that, editing it, fixing comments, etc. But holding its hand while it deals with LSP issues, like a logger sometimes not being visible, is just not something I see a reason to waste my time on.

  • After I have written a feature and I'm in the bug-ironing-out stage, that's where I like the agents to do a lot of the grunt work; I don't want to write jsdocs or fix this lint issue.

    I have also started using them for writing tests.

    I will write the first test, the “good path”; the agent can copy it and tweak the inputs to trigger all the branches far faster than I can.
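The workflow in the comment above can be sketched roughly like this. The `clamp` function and the assertions are hypothetical stand-ins for whatever feature was just written; the point is that only the first check is hand-written, and the rest are copies with tweaked inputs:

```typescript
// Hypothetical function under test: clamp a value into [min, max].
function clamp(value: number, min: number, max: number): number {
  if (value < min) return min;
  if (value > max) return max;
  return value;
}

// Hand-written "good path" test: an in-range value passes through unchanged.
console.assert(clamp(5, 0, 10) === 5, "good path");

// Agent-generated copies of that test, with inputs tweaked to hit the
// remaining branches:
console.assert(clamp(-3, 0, 10) === 0, "below minimum");
console.assert(clamp(42, 0, 10) === 10, "above maximum");
```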

It likely is acceptable for business-focused code. Even if the AI code is less than optimal, it's probably better quality than what many humans will write. I think we can all share some horror stories of what we've seen pushed to production.

Executives/product managers/sales often only really care about getting the product working well enough to sell it.

> The rate at which a person running these tools can review and comprehend the output properly is basically reached with just a single thread with a human in the loop.

That's what you're missing -- the key point is, you don't review and comprehend the output! Instead, you run the program and then issue prompts like this (example from simonw): "fix in and get it to compile" [0]. And I'm not ragging on this at all, this is the future of software development.

[0] https://gisthost.github.io/?9696da6882cb6596be6a9d5196e8a7a5...

  • It's a bit like the argument with self driving cars though. They may be safer overall, but there's a big difference in how responsibility for errors is attributed. If a human is not a decision maker in the production of the code, where does responsibility for errors propagate to?

    I feel like software engineers are taking a lot of license with the idea that if something bad happens, they will just be able to say "oh, the AI did it" and no personal responsibility or liability will attach. But if they personally looked at the code, and their name is underneath it signing off the merge request and acknowledging responsibility for it, we have a very different dynamic.

    Just like artists have to re-conceptualise the value of what they do around the creative part of the process, software engineers have to rethink what their value proposition is. And I think a large part of it is: you are going to take responsibility for the AI output. It won't surprise me if, after the first few disasters, we see liability legislation that mandates human responsibility for AI errors. At that point, many of the people all in on agent-driven workflows that are explicitly designed to minimise human oversight are going to find themselves with a big problem.

    My personal approach is I'm building up a tool set that maximises productivity while ensuring human oversight. Not just that it occurs and is easy to do, but that documentation of it is recorded (inherently, in git).

    It will be interesting to see how this all evolves.

  • I've commented on this before, but issuing a prompt like "Fix X" makes so many assumptions (like a "behaviorist" approach to coding), including that the bug manifests in an externally and consistently detectable way, and that you notice it in the first place. TDD can reduce this but not eliminate it.

    I do a fair amount of agentic coding, but always periodically review the code even if it's just through the internal diff tool in my IDE.

    Approximately 4 months ago Sonnet 4.5 wrote this buried deep in the code while setting up a state machine for a 2d sprite in a relatively simple game:

      // Pick exit direction (prefer current direction)
      const exitLeft = this.data.direction === Direction.LEFT || Math.random() < 0.5;
    

    I might never have even noticed the logical error but for Claude Code attaching the above misleading comment. 99.99% of true "vibe coders" would NEVER have caught this.
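For what it's worth, the logic error in that snippet is that the `||` only honors the comment when the direction is LEFT; when it is RIGHT, the result is a 50/50 coin flip, so the current direction is not "preferred" at all. A sketch of both versions, with hypothetical standalone names in place of `this.data.direction`:

```typescript
enum Direction { LEFT, RIGHT }

// Buggy version from the snippet above: when direction is RIGHT this is a
// pure coin flip, so the "prefer current direction" comment is misleading.
function exitLeftBuggy(direction: Direction): boolean {
  return direction === Direction.LEFT || Math.random() < 0.5;
}

// One version that actually prefers the current direction: keep it with
// probability preferProb (an assumed tuning knob), otherwise reverse it.
function exitLeftPreferred(direction: Direction, preferProb = 0.8): boolean {
  const preferredIsLeft = direction === Direction.LEFT;
  return Math.random() < preferProb ? preferredIsLeft : !preferredIsLeft;
}
```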

Based on Gas Town, the people doing this agree that they are well beyond the amount of code they can review and comprehend. The difference seems to be that they have settled on a system that, in their minds, makes it not a terrible idea.

> running these tools can review and comprehend the output properly

You have to realize this is targeting manager and team-lead types who, frankly, already mostly ignore the details and quality. "Just get it done," basically.

That's fine for some companies looking for market fit or whatever - and a disaster for some other companies now or in future, just like outsourcing and subcontracting can be.

My personal take is: speed of development usually doesn't make that big a difference for real companies. Hurry up and wait, etc.