Comment by KronisLV

5 hours ago

Used them for a while! They didn't seem to have prompt caching so I burnt through the daily 24M token limitations really quickly when doing large scale changes on a codebase (essentially a team's worth of menial migration/refactoring work). A lot of it was okay, but plenty had to be re-done and I still spotted some issues months down the line, in part I blame their model catalogue which did get an update to GLM 4.7 sometime way back, but definitely is showing its age: https://inference-docs.cerebras.ai/models/overview

Quality wise, Anthropic gives me the best results (Opus for almost everything, I make sub-agents with fresh context review its work, after 2-10 loops, usually finds most issues). Token amount wise for agentic work, DeepSeek V4 is up there. What Cerebras is doing pretty cool though, apparently they even have prompt caching now like the other big providers: https://inference-docs.cerebras.ai/capabilities/prompt-cachi... At the same time, producing bad code faster was annoying in a uniquely new way.

Wish they'd update the models with their subscription, it could genuinely be great with the proper harness. Like if they can run GLM 4.7, surely they could at least get DeepSeek V4 Flash with a big context window going as a starting point. How can you have so much money to make your own chips, but can't run modern models that you can get for free? It's like they don't want people to use their subscription.

2 comments

KronisLV

cactusplant7374 4 hours ago

Have you tried Codex? If you have, how does it compare to Opus?

KronisLV 3 hours ago

Codex is pretty good, OpenAI models are up there with Anthropic's, though I still prefer the latter for most development tasks (in part UI/UX, in part personal preference for how the model performs and interacts with me and the codebases). That said, if you do get a subscription from OpenAI, they actually have more generous usage limits than Anthropic - Anthropic's Pro tier is borderline useless for agentic development and I just went with their 100 USD Max tier instead. OpenAI might be more cost effective, though GPT-5.5 is more expensive than GPT-5.4, for example.
I'm recently also considering downgrading to Pro and using DeepSeek V4 Pro for anything but the more complex tasks and basically wrote a little utility to hook Claude Code up with 3rd party providers better: https://ccode.kronis.dev/ or tbh I could also just use OpenCode on the CLI or maybe something like KiloCode in Visual Studio Code (sadly RooCode got retired, liked their UI/UX a lot too).
I guess where I'm going with all this is that most of the SOTA or near-SOTA models are pretty okay and if you want, you should either get their more affordable plans for a month and experiment, or maybe hook up whatever tools you have with something like OpenRouter and try out a bunch of them: https://openrouter.ai/ (though some of their providers quantize the models a lot, look out for that) Personally I'd also add the new Kimi and GLM models to the list of the ones to try out.
Paying for API tokens isn't really financially good long term for anyone but companies and eventually most folks just settle on a subscription of some sort, since those are heavily subsidized and more cost effective.