← Back to context

Comment by sally_glance

6 hours ago

They absolutely can do that if you give them the tools. Seeing Claude (I use it with opencode agents) run curl and playwright to verify and then fix it's implementation was a real 'wow' moment for me.

We have different experiences. Often I’ll see Claude, et. al. find creative ways to fulfill the task without satisfying my intent, e.g., changing the implementation plan I specifically asked for, changing tolerances or even tests, and frequently disabling tests.

  • I see these “you had a different experience than me” comments around AI coding agents a lot and can concur; I’ll have a different experience with Copilot from day-to-day even, sometimes it’s great and other days I give up on using it at all it’s being so bad.

    Makes me honestly wonder — will AGI just give us agents that get into bad moods and not want to work for the day because they’re tired or just don’t feel like it!