Comment by joystick_0x0

4 hours ago

I am not sure what I am doing wrong then. I have been using Claude for the last 7 months, and from time to time I try other models like DeepSeek, Kimi, etc. Nothing comes even close to it. Claude one-shots the task almost every time (99.99%).

InsideOutSanta 3 hours ago

In my experience, there is a very specific use case of one-shotting complex, long tasks with relatively vague or incomplete descriptions where Opus does substantially better than all other models I've tried, including GPT 5.5, GLM 5.1 and DS4. It seems to be better at inferring unstated requirements and creating a complete, working, reasonably well-designed solution.

However, that's probably not how most professional developers use LLMs. I tend to give well-specified, more constrained tasks, and for those, I find that Opus performs worse than other models precisely because it tends to infer unstated requirements and do things I didn't want it to do. In this situation, GPT 5.5 works better for me because it does precisely what I ask and nothing more.

skerit 3 hours ago

Same here. Claude isn't perfect. It still makes a lot of mistakes. But whenever I try GPT-5.5 it's ten times worse, and Claude just has to clean up GPT's mess.

OtomotO 4 hours ago

You're obviously not doing anything wrong if it works for you.

It worked for me too, for months, when I was working on trivial web projects.

Around February of this year it got lobotomized, and I cancelled my subscription at the end of March.

I am not going back.