Comment by sally_glance

8 hours ago

They absolutely can do that if you give them the tools. Seeing Claude (I use it with opencode agents) run curl and playwright to verify and then fix its implementation was a real 'wow' moment for me.

We have different experiences. Often I’ll see Claude et al. find creative ways to fulfill the task without satisfying my intent, e.g., changing the implementation plan I specifically asked for, changing tolerances or even tests, and frequently disabling tests.

  • I see these “you had a different experience than me” comments about AI coding agents a lot and can concur; my experience with Copilot varies even from day to day. Sometimes it’s great, and other days it’s so bad that I give up on using it at all.

    Makes me honestly wonder: will AGI just give us agents that get into bad moods and don’t want to work for the day because they’re tired or just don’t feel like it?

    • If part of the goal is to emulate a person's abilities, then surely that includes a person's ability to fuck things up.

  • Are you a customer?

    • Don’t downvote because you don’t like the question.

      It obviously adds to the discussion: paid and non-paid accounts are being conflated daily in threads like these!

      They’re not the same tier account!

      Free users, especially ones deemed less interesting to learn from for the future, are given table scraps when they feel it’s necessary for load reasons.
