Comment by serial_dev

8 hours ago

You never get "the same" Steph Curry, he might be tired, annoyed by a fan, getting older... but if he and I were to throw 100 3-pointers, we could all correctly guess who will perform better.

Good point.

But I use Codex and Claude daily (work and hobby respectively). And there are days when one or the other just seems to have gotten up on the wrong side of the bed. Or is just being lazy. Or is suddenly super-powered and does everything, including what I asked it not to. (To be fair, the same thing happens with me. :/)

I am convinced that if I were benchmarking, I would conclude these are different models on different days.

[This conviction may say more about me than about the model.]

  • That's also fair. Anthropic has lobotomized their services a couple of times already. One week, you are in awe that the tools figure out everything, explain everything, consider everything, produce a clean fix... the next week, they are completely useless.