Comment by pyeri

1 day ago

Have you tried one of the Kimi K2 models or the latest GLM models by z.ai? The general consensus is that they're at least on par with Claude-class models.

pimeys 1 day ago

They are, but in our evals GLM 5.2 (unquantized), for example, performs as well as Opus while using more tokens and taking more time.

I really wish this would change soon but they are not there yet.

klardotsh 1 day ago

Using even double the total tokens and taking, what, 2-3x the time still seems worth it if prices are 5x+ cheaper (which OpenRouter [1] claims is the case).

On NeuralWatt for my personal projects at home (not affiliated, just a happy customer), I get so much more mileage out of GLM than I get out of Claude at work, specifically because it's priced as a hammer I can pound any nail-shaped-object with, not a delicacy I need to carefully budget-analyze to try to figure out if it's worth burning my monthly spend limits on this task.

[1] https://openrouter.ai/compare/z-ai/glm-5.2/anthropic/claude-...

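
The trade-off above is simple arithmetic, and a rough sketch may make it concrete. The function name and the numbers are illustrative assumptions taken from the figures mentioned in the comment (2x the tokens, 5x cheaper per token), not measured results, and time cost is ignored:

```python
def relative_cost(token_multiplier: float, price_ratio: float) -> float:
    """Spend on the cheaper model relative to the pricier one for the same task.

    token_multiplier: how many times more tokens the cheaper model uses.
    price_ratio: how many times lower its per-token price is.
    A result below 1.0 means the cheaper model still costs less overall.
    """
    return token_multiplier / price_ratio


# Hypothetical figures from the comment: 2x the tokens at one fifth of the price.
print(relative_cost(2.0, 5.0))  # 0.4
```

Under these assumptions the break-even point is where the token multiplier equals the price ratio; anything below that is cheaper in money, though the extra wall-clock time is not accounted for here.
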
Den_VR 1 day ago

I thought true token use was being hidden by both Anthropic and OpenAI.

- vanviegen 1 day ago
  
  No, they do specify token counts, as they let you pay for them. They just don't tell you what these thinking tokens actually are.
  

stavros 1 day ago

If K2 or GLM 5.2 are on par with Opus 4.8 I'll eat my hat. They're good, but they're not that good. Deepseek V4 Pro has been better than Sonnet for me, but the only model that comes close to or surpasses Opus 4.8 is GPT-5.5.

Aeolun 1 day ago

GLM 5.2 is far better than DeepSeek V4. Seriously feels like I’m talking to a Claude model. Also burns tokens like one, so there is that. DeepSeek is unbeatable on price/quality.

fjsoxjdnwk 1 day ago

Honestly just give it time. This stuff moves so fast that next month the conversation will be different. For folks who don’t like the ID privacy issues, use DeepSeek et al. and it should be able to get the job done, even if the experience takes a bit more wrangling.

The problem with the ID verification is that they can pair introspective conversations with ID. Either that bothers people or it doesn’t.

Main point: we shouldn’t fret about the current state of models, because the ID verification has future implications. Models will change and competition will catch up. Decide based on what feels right in the long run, not on whether TODAY’S model is better at Anthropic.

- stavros 1 day ago
  
  I agree with this; my disagreement was strictly with the claim that the current open models are as good as Opus.
  