Comment by Ainaguade

8 days ago

"This is exactly why we built AINAScan — we found that AI-generated code passes all tests and 'works', but consistently produces the same 15 structural bugs: save functions that never write to DB, async functions with no await, parameters that have zero effect on return values. Linters miss all of these. The code looks fine until production."
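
To make one of those patterns concrete, here's a rough sketch of a check for "async functions with no await" using Python's standard `ast` module. This is only my guess at what such a check could look like, not how AINAScan works, and the function names are made up for illustration. It's a heuristic: an `async def` with no await can be legitimate (e.g. when it has to match an interface).

```python
import ast


def _uses_async_feature(func: ast.AsyncFunctionDef) -> bool:
    """True if the function's own body awaits or uses async for/with."""
    stack = list(func.body)
    while stack:
        node = stack.pop()
        if isinstance(node, (ast.Await, ast.AsyncFor, ast.AsyncWith)):
            return True
        # Async comprehensions: [x async for x in source]
        if isinstance(node, ast.comprehension) and node.is_async:
            return True
        # Nested functions and classes have their own scope; skip them.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.Lambda, ast.ClassDef)):
            continue
        stack.extend(ast.iter_child_nodes(node))
    return False


def async_defs_without_await(source: str) -> list[str]:
    """Return names of async functions that never await anything."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.AsyncFunctionDef)
        and not _uses_async_feature(node)
    ]


SAMPLE = '''
async def save(record):
    record["saved"] = True

async def fetch(client):
    return await client.get("/x")
'''

print(async_defs_without_await(SAMPLE))
```

On the sample above it flags `save` and leaves `fetch` alone.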

I've found they optimize largely for the happy path and don't consider enough edge cases, if any (e.g. what happens to everything downstream if, say, this specific HTTP request times out).
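
As a minimal sketch of the timeout case: the point is that the caller gets an explicit result and leaves downstream state well-defined instead of half-updated. Everything here (`FetchResult`, `process_order`, the `payment_pending` status) is hypothetical and only for illustration. The HTTP call is passed in as a function so the failure path can be exercised without a network, and I'm assuming the client raises `TimeoutError`, which depends on the library you use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FetchResult:
    ok: bool
    body: Optional[bytes] = None
    error: Optional[str] = None


def fetch_with_timeout(fetch: Callable[[float], bytes],
                       timeout_s: float = 5.0) -> FetchResult:
    """Call fetch(timeout_s) and turn a timeout into an explicit result."""
    try:
        return FetchResult(ok=True, body=fetch(timeout_s))
    except TimeoutError as exc:
        return FetchResult(ok=False, error=f"timeout after {timeout_s}s: {exc}")


def process_order(order: dict, fetch: Callable[[float], bytes]) -> dict:
    """Return an updated copy of the order; never leave it half-updated."""
    result = fetch_with_timeout(fetch)
    if not result.ok:
        # The unhappy path: record a state downstream code can act on.
        return {**order, "status": "payment_pending", "note": result.error}
    return {**order, "status": "paid"}


def simulated_timeout(timeout_s: float) -> bytes:
    # Stand-in for a real HTTP call that never answers in time.
    raise TimeoutError("simulated")


print(process_order({"id": 1, "status": "new"}, simulated_timeout))
```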

I find Claude fighting me when I point out things like that as well, claiming it's not worth worrying about. Then, when I point out the train wreck that would ensue from leaving some things in a weird state, it flips its script: "I was incorrect about that..."

Folks will claim that's a skill issue, but the real issue is letting the LLM run off without any oversight. It creates so much code you can't possibly review it all yourself, so, like you've found, you hit a lot of problems in production.

AGENTS.md and friends help, but I've found it frequently ignoring the rules in there as well.

Personally, the biggest wins I've found are:

1. Use AI to go harder, not faster - use it for code reviews, second opinions, and researching all the angles of a deep topic, but don't use it to pump out a whole app. Don't outsource your entire understanding to it (which is precisely what we do at my job today...)

2. Use it for the boring tasks that are hard to get wrong and can be easily validated.

Oh, and tests... I've seen so many completely useless tests being generated. Any valuable test that I've ever seen Claude create came from me finding a bug or a missed edge case.
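
To illustrate the difference, here's a small made-up example (the `paginate` function and test names are hypothetical). The first test is the kind that gets generated by default; the others are the kind that only show up once someone has hit the bug or thought about the edge case.

```python
def paginate(items: list, page_size: int) -> list[list]:
    """Split items into pages; the last page may be shorter."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def test_happy_path():
    # Typical generated test: even split, nothing interesting happens.
    assert paginate([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]


def test_last_partial_page_is_kept():
    # Edge case: the trailing items must not be silently dropped.
    assert paginate([1, 2, 3], 2) == [[1, 2], [3]]


def test_empty_input_yields_no_pages():
    assert paginate([], 2) == []


def test_non_positive_page_size_is_rejected():
    for bad in (0, -1):
        try:
            paginate([1], bad)
        except ValueError:
            continue
        raise AssertionError(f"page_size={bad} should have been rejected")
```

These run under pytest as written, or you can just call the functions directly.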