Comment by alexjplant
8 hours ago
For personal use I've noticed Claude (via the web-based chat UI) making really bizarre mistakes lately, like ignoring input or making completely random assumptions. At work, Claude Code has turned into an absolute dog: it fails to follow instructions and builds stuff like a lazy junior developer, without any architecture, tests, or verification. This is even with max effort, Opus 4.6, multiple agents, early compaction, etc.

I don't know what they did, but Anthropic's quality lead has basically evaporated for me. I hope they fix it, because I've since adapted my project's Claude artifacts for use with Codex and started using it instead. It feels like Claude Code did earlier this year.
I'd like to give the new GLM models a try for personal stuff.
> At work Claude Code has turned into an absolute dog.
Could it be related to this? https://news.ycombinator.com/item?id=47660925
I've noticed the same thing, and even done side-by-side tests where I compared Claude Code with Cursor, both running Opus 4.6.
It seems Cursor somehow builds a better contextual description of the workspace, so the model knows what I'm actually trying to achieve.
The problem is that with Cursor I'm paying per token, so as GP suggested, you can easily spend $100+ per month versus $20 on Claude Code.
Same here; I'm looking hard for an alternative to what I had.
And I'm seeing the same thing in my sphere: everyone has been bailing on Anthropic over the past few weeks. I figure that's why we're seeing more posts like this.
I hope they're paying attention.