Comment by muddi900

18 hours ago

z.ai will use quantized models in off hours. Buyer beware

This... doesn't make sense. Why would they use a quantized model when load is low and the full model when load is high???

I have a subscription and I have not seen any difference in performance during on/off hours. What exactly are you basing this on?

Do you have proof for this?

  • No, they don't; it's just a smear campaign because the US tech companies are freaking out.

    • There are similar unfounded complaints about US companies. It's just what happens with a product that doesn't work 100% of the time even under ideal conditions. Some people are bound to get unlucky but blame it on deliberate action rather than random chance.

I hear a lot of people complaining, but I'm on their Max plan, never hit limits, use it non-stop, and overall it has been a fantastic experience.

  • Same feeling here on the Pro plan. I'm still on the old plan without the weekly quotas, but I have never exhausted the 5-hour limit so far.

  • Has 5.1's reliability improved? I would love to use it again, but the inference was just too unreliable when it was first released.

I was one of the people absolutely in misery when the GLM-5.1 model dropped. It wasn't quantized, I don't think, but it had some very gnarly issues: it would hit a certain context size, then seemingly try to quantize and fall apart. It was unusable. It went from being an excellent model all the way out to 200k context, to falling apart at only 60k, where it couldn't write in sentences and definitely couldn't tool call, then 100k, then 120k. It was terrible, and it felt like they had made my subscription so much worse. https://news.ycombinator.com/item?id=47677853

But very shortly after this submission/release of 5.1, after a mass outpouring of sadness, they fixed it. Things have been back to absolutely amazing. I joined right before 4.7, and 4.7 was incredible. 5.0 was fantastic. 5.1 has been a dream. GPT still catches a lot of stuff and is smarter, but man, GLM-5.1 is so capable, and it's frankly often a better writer; it often better understands and captures purpose and intent, whereas GPT often feels dry and focused on narrow technicals. I really appreciate GLM-5.1.

And I'm really glad Z.ai fixed the absurd damage in their systems. I do suspect they were trying to dynamically quantize as the context window grew, or some such trickery. It was not working at all, but somehow it took months to fix.