Comment by dudeinhawaii

16 days ago

Unpopular opinion but I prefer slow and correct.

My experience on Claude Max (still on it till end-of-month) has been frequent incomplete assignments and troubling decision making. I'll give you an example of each from yesterday.

1. Asked Claude to implement the features in a v2_features.md doc. It completed 8 of 10 but 3 incorrectly. I gave GPT-5.1-Codex-Max (high) the same tasks and it completed 10 of 10 but took perhaps 5-10x as long. Annoyingly, with LLM variability, I can't know for sure if I tried Claude again it would get it correct. The only thing I do know is that GTP-5.2 and 5.1 do a lot more "double-checking" both prior to executing and after.

2. I asked Claude to update a string being displayed in the UI of my app to display something else instead. The string is powered by a json config. Claude searched the code, somehow assumed it was being loaded by a db, did not find the json and opted to write code to overwrite whatever comes out of the 'db' (incorrect) to be what I asked for. This is... not desired behavior and the source of a category of hidden bugs that Claude has created in the past (other models do this as well but less often). Max took its time, found the source json file, and made the update in the correct place.

I can only "sit back and let an agent code" if I trust that it'll do the work right. I don't need it fast, I need it done right. It's already saving me hours where I can do other things in parallel. So, I don't get this argument.

That said, I have a Claude Max and OpenAI Pro subscription and use them both. I instead typically have Claude Opus work on UI and areas where I can visually confirm logic quickly (usually) and Codex in back-end code.

I often wonder how much the complexity of codebases affects how people discuss these models.

0 comments

dudeinhawaii

No comments yet

Contribute on Hacker News ↗