Comment by weinzierl
10 hours ago
"its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!"
Hmm, I agree with the point OP is making, but I'm not sure this is the best supporting argument. The bottleneck is finding the bugs; if he'd criticized people claiming AI will be the panacea for that, I'd be with him. But I wouldn't object to people saying agents are fast and good at fixing human-found bugs.
Agents are fixing bugs so quickly and at a scale humans can't do already.
> Agents are fixing bugs so quickly and at a scale humans can't do already.
The metric is how many defects are introduced per defect fixed. Being fast is bad if this ratio is above one.
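That ratio can be made concrete with a small sketch. This is purely illustrative; the function name and numbers are hypothetical, not anything from the thread:

```python
def net_defects(fixed: int, introduced_per_fix: float) -> float:
    """Net change in the open-defect count after `fixed` fix attempts.

    Negative means the codebase got healthier; positive means that
    fixing quickly actually made things worse.
    """
    return fixed * introduced_per_fix - fixed

# Ratio below one: fast fixing helps.
print(net_defects(100, 0.5))   # -50.0, i.e. 50 fewer open defects
# Ratio above one: speed just compounds the problem.
print(net_defects(100, 1.2))   # 20.0, i.e. 20 more open defects than before
```

The point being that throughput only helps if each fix removes more defects than it adds; above a ratio of one, faster agents dig the hole faster.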
The tweet is criticizing over-reliance on the "agents will fix it anyway" mindset.
The fact that we can fix things faster now doesn't mean that we should throw away caution and prevention. The specific point of his tweet is that we're seeing a lot of people starting to skip proper release engineering.
Agents are quick to fix bugs, yes, but it doesn't mean that users will tolerate software that gets completely broken after each new feature is introduced and takes a certain number of days to heal each time.
> Agents are fixing bugs so quickly and at a scale humans can't do already.
This is an illusion, I assure you. On a side project of mine with behavior that's very hard to translate into an algorithm (never mind code), after a few failed attempts between the two of us, I figured it out myself. I gave the AI (Opus) an extremely specific algorithm with detailed tests. It ignored all of it, tests included, as if I'd never said a word. It proudly declared the work done without ever writing the tests that would have proved it wrong; it essentially wrote code that changed no behavior at all and just gave the illusion of looking busy.
That's just a single extreme example that comes to mind, but I've had it ignore me at least 4-5 times a day this week.
If you think agents are fixing things reliably then you simply haven't noticed that they are "looking busy."
You got downvoted for speaking the truth. HN has a strong anti-AI contingent. They won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this codebase”. We’re not there yet, but soon we will be. Then what?
More likely people thought GP was missing the point: "MTTR-optimized YOLO deployment" only succeeds against recoverable errors that are detected quickly, with acceptable periods of downtime. You could have a bug silently corrupting data for months, and that data may only be used by one critical process that runs once a quarter. So you could introduce a time bomb that can't be gracefully recovered from (depending on the nature of the corruption).
So the point is not that agents cannot find bugs (they certainly can), it's whether you can shirk reviewing for bugs if MTTR is fast enough. There are circumstances where YOLO is appropriate, but they aren't the production environment of a mature application.
I don't think I missed the point, that is why I said I agree with the general point (and with what you said in your comment).
What I wanted to say is that the particular people that think "its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" are not the best argument for it.
But I won't die on this hill, maybe I'm just reading the sentence differently then others.
> won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this
But this is just holding the slop companies to the standard they declared for themselves! Just recently, the CEO of OpenAI babbled some nonsense on Twitter about how he hands tasks over to Codex, which, according to him, finishes them flawlessly while he plays outside with his kid.
> but soon we will be.
Ah yes, in 3-6 months, right? This time next year, Rodney, we'll be millionaires!