Comment by esperent

17 hours ago

That's why you have them write tons of tests. Way more than you generally would for human written code. And the agent writing/maintaining the tests is not the agent fixing the bugs.

I've personally had an LLM write an image resizing library for me. It's a fairly basic one, I didn't need anything fancy. I could have used something off the shelf, but it was at a time when I was testing what Claude could do. And to be honest, it just worked. One shot, if I recall correctly, or at least one session with a few tweaks and never touched again. It's been embedded in a larger app for several months and I don't recall hitting a single bug with that, specifically. So I'm not sure your complaints about "the 5th iteration" being broken have much ground here.

> It's a fairly basic one, I didn't need anything fancy.

> one session with a few tweaks and never touched again

> and I don't recall hitting a single bug with that, specifically.

And there you've got your answer. If every scenario were as simple as that, we wouldn't really need software development teams. I'm not saying that you can't get good results with an LLM tool, but most software is in constant flux, and software engineering is about keeping the cost of making new changes minimal.

So if you have a dependency, you want to treat it as a black box, because it lowers the cognitive load. But you don't want it to suddenly change its contract, including breaking it in some strange way. And that brings me to...

> That's why you have them write tons of tests.

Tests are not an implementation guarantee. They are a canary to warn about some errors. You assume the code is going to be written in good faith, but you place alert points to warn you about possible mistakes. Because you can't really test the full implementation without ending up with a brittle test suite (which you then have to maintain).
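To make the canary point concrete, here's a rough sketch in Python. Everything in it is made up for illustration: `resize` is a toy nearest-neighbour resizer, `broken_resize` is a deliberately wrong one, and `canary_check` is the kind of contract-level check you might reasonably write. The canary passes for both.

```python
def resize(pixels, new_w, new_h):
    """Toy nearest-neighbour resize of a 2D list of pixel values (hypothetical)."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def broken_resize(pixels, new_w, new_h):
    """Deliberately wrong: fills the whole output with the top-left pixel."""
    return [[pixels[0][0]] * new_w for _ in range(new_h)]


def canary_check(src, out, new_w, new_h):
    """Contract-level check: right shape, values stay within the source range."""
    lo = min(p for row in src for p in row)
    hi = max(p for row in src for p in row)
    return (
        len(out) == new_h
        and all(len(row) == new_w for row in out)
        and all(lo <= p <= hi for row in out for p in row)
    )


src = [[0, 10], [20, 30]]
good = resize(src, 4, 4)
bad = broken_resize(src, 4, 4)

# Both implementations satisfy the canary, but only one is a real resize.
good_passes = canary_check(src, good, 4, 4)
bad_passes = canary_check(src, bad, 4, 4)
```

You could pin the exact output pixel by pixel to catch `broken_resize`, but then the test breaks the day you switch interpolation methods. That's the trade-off.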

And tests rely on a lot of assumptions (mocks, initial cases, fakes, ...). Those should be treated with care, because as soon as one of them is wrong, the test cases it affects are make-believe.
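A small sketch of how that goes wrong, using Python's `unittest.mock`. The names here (`thumbnail_size`, `get_size`, `RealLib`) are hypothetical, not from any real library. The mock bakes in the assumption that the dependency returns `(width, height)`, while the pretend real one returns `(height, width)`.

```python
from unittest.mock import Mock


def thumbnail_size(image_lib, path, max_side):
    """Scale to fit within max_side, assuming get_size returns (width, height)."""
    width, height = image_lib.get_size(path)
    scale = max_side / max(width, height)
    return round(width * scale), round(height * scale)


# The mock encodes our assumption about the dependency: (width, height).
fake_lib = Mock()
fake_lib.get_size.return_value = (400, 200)
mocked_result = thumbnail_size(fake_lib, "a.png", 100)
assert mocked_result == (100, 50)  # the test is green


class RealLib:
    """Hypothetical real dependency that actually returns (height, width)."""

    def get_size(self, path):
        return (200, 400)  # the same 400x200 image, reported the other way round


# Against the real thing, the result comes out transposed.
real_result = thumbnail_size(RealLib(), "a.png", 100)
```

The test stays green the whole time. It's only verifying the mock.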

The only true testing of your software is done in production. Everything else is about avoiding the easy mistakes.