Comment by Robdel12

20 hours ago

Wow, bad enough for them to actually publish something and not cryptic tweets from employees.

Damage is done for me, though. Even just one of these things (messing with adaptive thinking) is enough for me to not trust them anymore. And then there's their A/B testing on pricing this week.

The A/B testing is by far the most objectionable thing from them so far, in my opinion, if only because of how terrible it would be for something like that to become standard for subscriptions. I'd argue it isn't even A/B testing of pricing; it's silently giving a subset of users an entirely different product than they signed up for. It would be like 2% of Netflix customers having full-screen ads randomly pop up and cover the video throughout a show. Historically, the only thing stopping companies from extraordinarily user-hostile decisions has been public outcry, and limiting a change to a small subset of users seems intentionally designed to limit the PR consequences.

  • The best possible situation I can imagine is that Anthropic just wanted to measure how much value Claude Code has for Pro users and didn't mean to change the plan itself (so those users would get CC as a "bonus"), but even that is questionable to start with.

People come at this with all kinds of life experience. The above notion of trust to me is quaint and simplistic. I suggest another way to frame trust as a more open ended question:

    To what degree do I predict another person/org will give me what I need and why?

This shifts "trust" away from all-or-nothing and gets me thinking about things like "what are the moving parts?", "what are the incentives?", and "what is my plan B?".

In my life experience, looking back, when I've found myself swinging from "high trust" to "low trust", the change was usually rooted in my expectations: a naive understanding of the world that was rudely shattered.

Will you force trust to be a bit? Or can you admit a probability distribution? Bits (true/false or yes/no or trust/don't trust) thrash wildly. Bayesians update incrementally: this is (a) more pleasant; (b) more correct; (c) more curious; (d) easier to compare notes with others.
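The incremental-update idea can be made concrete with Bayes' rule. A minimal sketch (my own illustration, not from the thread; the function name and the probabilities are hypothetical):

```python
def bayes_update(prior, p_obs_if_trustworthy, p_obs_if_not):
    """Posterior P(trustworthy | observation) via Bayes' rule."""
    numerator = p_obs_if_trustworthy * prior
    evidence = numerator + p_obs_if_not * (1 - prior)
    return numerator / evidence

# Start fairly confident a vendor will give you what you need...
trust = 0.9

# ...then observe something suspicious, assumed twice as likely from an
# untrustworthy vendor (0.4) as from a trustworthy one (0.2).
trust = bayes_update(trust, p_obs_if_trustworthy=0.2, p_obs_if_not=0.4)
# Trust drops to about 0.82 -- lower, but nowhere near 0.
```

The point is the shape of the update, not the specific numbers: one bad signal moves the probability down a notch, whereas a trust/don't-trust bit would flip all the way on the same evidence.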

so who do you trust and go to? (NotClearlySo)OpenAI?

  • I went with MiniMax. The token plans are more than I currently need: 4,500 messages per 5 hours and 45,000 messages per week for $40. I can run multiple agents, and they don't think for 5-10 minutes like Sonnet did. Also, I can finally see the thinking process, whereas Anthropic chose to hide it all from me.

    I'm using Zed and Claude Code as my harnesses.

  • I "subconsciously" moved to codex back in mid Feb from CC and it's been so freaking awesome. I don't think it's as good at UI, but man is it thorough and able to gather the right context to find solutions.

    I use "subconsciously" in quotes because I don't remember exactly why I did it, but it aligns with the degradation of their service so it feels like that probably has something to do with it even though I didn't realize it at the time.

    • Anthropic definitely takes the cake when it comes to UI related activities (pulling in and properly applying Figma elements, understanding UI related prompts and properly executing on it, etc), and I say this as a designer with a personal Codex subscription.

    • Codex does better if you ask it to take screenshots and critique its own UI work and iterate. It rarely one-shots something I like but it can get there in steps.

    • It's been frustrating how bad it is at UI. I'm starting to test out using their image2 for UI and then handing the images to Codex to build out into code, and I'm impressed and relieved so far.

    • Codex isn't great at UI, but you might find Gemini is competent enough as an adjunct. I've had some luck with that.

  • At the moment, yeah. If Google ever figures out how to build an agentic model, I would use them as well.

    However you feel about OpenAI, at least their harness is actually open source, and they don't send lawyers after OSS projects like opencode.

  • Anecdotally, I know many people who have supplemented Claude with Codex, and are experimenting with models such as GLM 5.1, Kimi, Qwen, etc.