Comment by Oras

10 hours ago

Also pricing: I wanted to give it a try, but when it's only 30% cheaper than Opus, I wouldn't go for it with these issues.

8 comments

nijave 3 hours ago

The z.ai coding plan is a fairly decent deal at ~$16/month USD, considering it's supposed to include a fair bit more usage than the comparable $20/month Claude plan. On the other hand, z.ai seems a bit on the slower side for raw model tok/sec throughput.

chpatrick 9 hours ago

Its pricing is a lot cheaper if you can run it yourself.

nijave 3 hours ago

Not this one. It's a SOTA-class model; >800 GiB of VRAM is required at fp8.

jeremyjh 8 hours ago

What?

It is less than 20% of the cost of Opus at API rates. 1.40/4.40 vs 5/25.
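
For anyone who wants to check the ratio, here is a quick sketch of the arithmetic. It assumes the figures above are USD per million input/output tokens (the units are not stated), and the names `MODEL_PRICE`, `OPUS_PRICE`, and `price_ratio` are made up for illustration:

```python
# Assumption: prices are USD per million tokens, given as (input, output).
MODEL_PRICE = (1.40, 4.40)
OPUS_PRICE = (5.00, 25.00)


def price_ratio(cheap, expensive):
    """Return (input_ratio, output_ratio) of the cheap price relative to the expensive one."""
    return cheap[0] / expensive[0], cheap[1] / expensive[1]


input_ratio, output_ratio = price_ratio(MODEL_PRICE, OPUS_PRICE)
print(f"input: {input_ratio:.1%}, output: {output_ratio:.1%}")
```

Under that assumption, the under-20% figure holds for output tokens, while input tokens come out closer to 28%.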

cmrdporcupine 6 hours ago

Not when you factor in token efficiency. It burns a lot more tokens to do the same job, so when I compared it to GPT5.5 I was frankly not really much ahead, and with weaker thinking.

Maybe it makes sense if you have z.ai's (not greatly priced) subscription plan, but it's not competitive against an OpenAI or Anthropic monthly coding subscription plan. I burned through almost $10 worth of tokens just doing an hour of work.
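
A rough way to put numbers on the token-efficiency point is to ask how many times more tokens the cheaper model can burn before it costs the same per job. The sketch below is only illustrative: it reuses the API prices quoted upthread (assumed to be USD per million input/output tokens), uses Opus as the comparison since no GPT5.5 prices appear in the thread, ignores caching discounts, and the workload size and function names are hypothetical:

```python
# Assumption: prices are USD per million tokens (input, output), as quoted upthread.
MODEL_PRICE = (1.40, 4.40)
OPUS_PRICE = (5.00, 25.00)


def job_cost(price, input_tokens, output_tokens, token_multiplier=1.0):
    """Cost in USD for one job; token_multiplier scales both token counts."""
    in_price, out_price = price
    scaled_in = input_tokens * token_multiplier
    scaled_out = output_tokens * token_multiplier
    return (scaled_in * in_price + scaled_out * out_price) / 1_000_000


def break_even_multiplier(cheap, expensive, input_tokens, output_tokens):
    """Token multiplier at which the cheap model's job cost equals the expensive one's."""
    return job_cost(expensive, input_tokens, output_tokens) / job_cost(
        cheap, input_tokens, output_tokens
    )


# Hypothetical workload: 2M input tokens and 200k output tokens per job.
m = break_even_multiplier(MODEL_PRICE, OPUS_PRICE, 2_000_000, 200_000)
print(f"break-even token multiplier: {m:.2f}x")
```

If the cheaper model uses more than that multiple of the tokens for the same job, the per-token discount is gone; the actual multiplier depends heavily on the input/output mix of the workload.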
- Sanzig 5 hours ago
  
  Take a look at Ollama Cloud: https://ollama.com/pricing
  You get access to a whole bunch of bleeding-edge open models, including GLM-5.2, Kimi K2.7, DeepSeek 4 Pro, etc. Inference runs on US/SG/EU cloud providers with zero-data-retention policies. The $20/mo tier is very generous, in my experience.
  