Comment by gofreddygo
1 day ago
For months, employees had the option to choose Claude Code or Copilot. Now they don't.
Underlying model choice still has no restrictions. Opus 4.6 is by far the most popular. There are still big $$$ bills going Anthropic's way.
Curious if anyone around here stayed on 4.6 despite having the choice to use 4.7.
I went to 4.7, didn't have a choice, found it unsatisfactory, then Claude quietly added in the option to use 4.6, so I'm back on 4.6, and I'm not the only one in my company.
I had far more hallucinations with 4.7 than 4.6.
I'll try it again after a few more months for them to get it right, but 4.6 is what changed my mind on LLMs as a tool, and 4.7 felt like a step backwards, so for now I'm sticking with something that has delivered me value, instead of arguing with a model ostensibly better that was making shit up 1 - 2 times a day. It was really disappointing.
I can give examples if needed, I screenshotted the most aggravating ones, but what worries me is which ones I didn't recognise.
How did you manage to do that?
/model command returns only 4 choices for me: Opus 4.7, two Sonnet options and Haiku.
Opus 4.7 went through a major degradation a few weeks ago (way more hallucinations and rabbit holes than usual). Anthropic fixed it. Give it another shot.
Opus 4.7 seems very smart, but the adaptive reasoning always leaves me uncertain how hard it is actually trying. And it is far too argumentative. It seems to think it HAS to contradict you in every response.
I have stuck with 4.6. I fully believe 4.7 can be smarter for truly complex and long running agentic use. But I prefer the more direct, literal mechanistic style and 4.6 seems to be peak Opus for that.
Stay with 4.6 if you can; it is disabled (afaik) in the VS Code Claude Code extension.
4.7 IMO is around 10-20% worse at understanding the intention behind your prompt. You need to put more effort into explaining your intention clearly so it doesn't drift.
Same. 4.7 intelligence is significantly worse than 4.6 on ALL 3P harnesses. So you only get decent performance on Claude Code and the Anthropic API/subscription; on every other harness and/or cloud provider inference (Bedrock) it performs worse than 4.6 on almost every task. This is not just my own anecdote: I've talked to many colleagues from AWS, Microsoft and so on, and they all agree that something fishy is going on.
I was recently talking to someone about that! I wasn't sure if it was my imagination, but I felt like Opus 4.6 was way more diligent about looking things up online and making sure that its response was accurate, while Opus 4.7 seems content to just throw out an answer as quickly as possible with little care for accuracy. I started to always remind it to do an online search and to double check its work, to the point where I had to add a custom memory.
I switched back to 4.6, thinking, as most did, that 4.7 had introduced some jankiness. I soon switched back to 4.7, though. I think I might've adapted myself to what 4.7 does and how it does it. 4.6 felt like a step backward.
4.7 turned out to be a disaster in multilingual settings, so I've stuck with 4.6 so far. 4.7 seems to be optimized for (a very specific slice of) coding at the expense of everything else.
It also seems to be designed to optimise the design / planning phase of a typical programming project.
I still use 4.6 if I need Opus. It's mostly GPT-5.5 for me. I only switch to 4.6 when I know GPT cannot do something, like push without running the tests (because AGENTS.md said so).
Although GPT's been acting weird since Thursday...
I’ve stayed on 4.6. Was thinking of trying 4.7 though just today. Still, I did not jump on it day one.
Switched back when 4.7 had an issue last week and it was wayyy faster. I assume mostly because a lot of people have moved off but might consider using it more just for the speed boost.
I don't want to change from 4.6 because I'm finding it so good (I could change).
I've spent the last couple of days building Swift bindings to a monster CPP lib and I've actually had fun.
I use 4.6 and I've configured the advisor to be on 4.7, so when something's more complex the advisor can help. At least that's how I do it with Claude Code; not sure if the others have implemented the concept of advisors.
Wouldn't they be forced into API pricing instead of per-seat like that, though? That would potentially be a massive cost increase. But I've discovered through talking to colleagues that some companies are already doing that. I can't understand why you'd ever do that when you can get VC-subsidized pricing for now, at least for all initial in-plan usage. I doubt many developers go past the limit anyway, and for those who do, you can switch just the extra usage to on-demand.
Teams is the only one with seat pricing. Teams has a user cap of 150. Enterprise is usage based pricing only now (with a £20/user service charge)
I use copilot cli and I can pick Anthropic models. The Microsoft interface seems fine to me, and equivalent. Not sure what the big deal is.
Funny I had the opposite experience. The Claude models seemed equivalent to GPT-5.4/5 in a generic harness like Copilot CLI or Opencode or Pi, but Claude Code the app/harness is so much better than all the others that I switched at work, even though I'd much prefer to use a non-proprietary harness (and eventually I do want to get Pi set up to be comparable).
Well, maybe I shouldn't have assumed the harness wasn't a big deal. When Microsoft changes its pricing on 6/1 I'll try some of the others.
Harness makes a difference. Also in copilot you have smaller context for Claude models.
And you get token-based pricing from June 1.
Anthropic's Claude harness is much better than Copilot's, i.e. the tools and instructions in each harness are different. Anthropic's is just that much better (for Claude models, likely thanks to some amount of co-development).
Personally, I looked into Copilot's prompt and saw things that made me put it down immediately and start working on my own. I'm now using OpenCode for reasons, and I like it better than any Big AI tool. Using OC with Qwen3.6-MoE (for context) and generally happy with the results.