Comment by _pdp_
1 day ago
I asked Copilot (Claude Sonnet 4) to edit some specific parts of a project. It removed the lines that specifically have comments saying "do not remove", with a long explanation of why. Then it went ahead and modified the unit tests to ensure 100% coverage.
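For what it's worth, this particular failure is cheap to catch mechanically. Here is a rough sketch, not a finished tool, that scans a unified diff for removed lines carrying a "do not remove" marker. The function name and the marker regex are my own assumptions; adjust the pattern to whatever wording your comments use.

```python
import re

# Hypothetical marker pattern; match whatever wording your protected comments use.
PROTECTED = re.compile(r"do not remove", re.IGNORECASE)


def removed_protected_lines(diff_text):
    """Return protected lines that a unified diff removes and does not re-add."""
    removed = []
    added = set()
    for line in diff_text.splitlines():
        # Skip the old/new file headers. Limitation: a removed line whose
        # own content starts with "--" is skipped as well.
        if line.startswith(("---", "+++")):
            continue
        if line.startswith("-") and PROTECTED.search(line):
            removed.append(line[1:].strip())
        elif line.startswith("+"):
            added.add(line[1:].strip())
    # A line that was only moved appears on both sides, so it is not reported.
    return [text for text in removed if text not in added]
```

Feed it the output of `git diff` before accepting the agent's changes; a non-empty result means a protected line is gone.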
Using coding agents is great btw, but at least learn how to double-check their work cuz they are also quite terrible.
This is the tricky part. The whole point of agents is, well, to do things so that we don't have to. But if you need to check everything they do, you might as well copy and paste from a chat interface...
Which makes me feel early adopters pay with their time. I'm pretty sure the agents will be much better with time, but that time is not exactly now, with endless dances around their existing limitations. Claude Code is fun to experiment with, but to use it in production I'd give it another couple of years (assuming they will focus on code stability and reducing its natural optimism as it happily reports "Phase 2.1.1 has been successfully completed with some minor errors with API tests failing only 54.3% of the time").
Claude loves to delete comments. I set up specific instructions telling it not to, and yet it regularly tries to delete comments that often have nothing to do with the code we're working on.
It's so hit and miss in Rust too. When I ask it for help with a bug, it usually tries a few things, then tries to just delete or comment out the buggy code. Another thing it does is replace the buggy code with a manual return statement and a comment saying "Returning a manual response for now". It'll then do a cargo build, proclaim that there are no errors and call it a day. If you don't check what it's doing, it would appear it has fixed the bug.
When I give it very specific instructions for implementation, it regularly adds static code with comments like "this is where the functionality for X will be implemented. We'll use X for now". It does a cargo build, then announces all of its achievements with a bunch of emojis despite not having implemented any of the logic that I asked it to.
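Since a clean cargo build says nothing about whether the logic exists, one cheap extra check is to grep the added lines of the diff for placeholder wording. Below is a minimal sketch along those lines. The function name and the phrase list are assumptions drawn from the examples in this thread; the only Rust-specific entries are the standard `todo!()` and `unimplemented!()` macros.

```python
import re

# Hypothetical phrase list based on the examples quoted above; extend as needed.
STUB_PATTERNS = [
    re.compile(r"for now", re.IGNORECASE),
    re.compile(r"will be implemented", re.IGNORECASE),
    # Rust's standard placeholder macros.
    re.compile(r"\b(todo|unimplemented)!\s*\("),
]


def added_stub_lines(diff_text):
    """Return added lines in a unified diff that look like placeholder code."""
    hits = []
    for line in diff_text.splitlines():
        # "+++" is the new-file header, not an added line.
        if line.startswith("+") and not line.startswith("+++"):
            text = line[1:].strip()
            if any(pattern.search(text) for pattern in STUB_PATTERNS):
                hits.append(text)
    return hits
```

This only flags suspicious wording, so it will miss a stub with no telltale comment and may flag innocent lines. It is a prompt to go read the change, not a replacement for tests that exercise the behavior.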