Comment by eithed

3 hours ago

I ask Claude to fix given test:

- runs the failing test | grep "x|failing" | tail 10

- runs the test again to get the "why it's failing" message | tail 10

- runs the test again because tail 10 cut off the message

Every time. What developer does things like this?!

I have a skill telling it not to do that: save the output of whatever test you run to a file, then read from that file using whatever commands you want. It ignores the skill.
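For the curious, the skill's intent is roughly the following (a sketch with a stand-in test runner; the function name and log path are illustrative, not the actual skill's wording):

```shell
# Stand-in for the real test runner; replace with your actual test command.
run_tests() {
  echo "PASS test_a"
  echo "FAIL test_b: expected 2, got 3"
}

# Run ONCE, capturing everything (stdout and stderr) to a file.
run_tests > /tmp/test_output.log 2>&1

# Then query the file as many times as needed -- no re-runs, nothing cut off.
grep "FAIL" /tmp/test_output.log
tail -n 20 /tmp/test_output.log
```

The point being: one test run, then cheap reads of the saved log, instead of re-running the suite every time a `tail` truncates the interesting part.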

Same for debugging. Something is failing, and instead of actually debugging the issue and looking at the results to see why, it reads the code and tries to deduce the cause. The first trace it finds that looks suspicious? "THAT'S IT, I FOUND IT. But let me reconsider." And after 15 minutes it produces a summary that is wrong. Set a breakpoint, look at the actual state, then make your decisions. It has a skill for debugging that is phrased to do exactly that! No. I've never seen a human do things like this either.

It's maddening. It's as if (puts on tinfoil hat) it's designed to waste your tokens while still eventually accomplishing its task.