Comment by aliljet

13 hours ago

This is the rugpull that is starting to push me to reconsider my use of Claude subscriptions. The "free ride" era of this being funded as a loss leader is coming to a close. While we break away from Claude, my hope is that I can continue to send simple problems to very smart local LLMs (qwen 3.6, I see you) and reserve Claude for purely extreme problems appropriate for its extreme price.

> This is the rugpull that is starting to push me to reconsider my use of Claude subscriptions.

I'm still with them because the model is good, but yes, I'm noticing my limits burning up somewhat faster on the 100 USD tier. I bet the 20 USD tier is even more useless.

I wouldn't call it a rugpull, since there might be good technical reasons for the change, but at the same time we won't know for sure unless they COMMUNICATE that to us. What's missing is a technical blog post that tells us more about the change and the tokenizer, although I fear this won't happen due to wanting to keep "trade secrets" or whatever (the unfortunate consequence of which is making the community feel like they're being rugpulled).

  • 20 USD tier was useless from the start. You'd get to the limit in 30 minutes. Codex with 20 USD on the other hand...

    • Give Codex a month or two and it'll just do the same thing, though now with a million more users because Claude Code wasn't good enough.

    • OpenAI has been doing the same thing gradually. Codex launched with generous Plus limits, then they introduced the $100 Pro tier, and Plus limits have quietly tightened since. With the same repetitive tasks I was running, consumption is noticeably higher now for the same output.

      The pattern feels deliberate — make the $20 tier just uncomfortable enough that power users upgrade, without officially announcing the reduction. If it continues, $20 buys you a demo and $100 buys you actual work.

I think an LLM that is a decent chunk smarter/better than other LLMs ought to be able to charge a premium, perhaps 10x or 100x its competitors' price.

See for example the price difference between taking a taxi and taking the bus, or between hiring a real lawyer vs. your friend at the bar who will give his uninformed opinion for a beer.

Quality of answers from quantized models is noticeably worse than using the full model.
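The degradation being described comes from quantization error: mapping weights onto a coarser integer grid loses precision, and the loss compounds as bit width shrinks. A minimal NumPy sketch (a toy symmetric round-to-nearest scheme, not any vendor's actual quantization method) makes the effect visible:

```python
import numpy as np

def quantize_dequantize(w, bits):
    """Toy per-tensor symmetric quantization: map weights to a signed
    integer grid of the given bit width, then reconstruct floats."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 7 for 4-bit, 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax    # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                    # dequantized approximation of w

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)  # stand-in weight tensor

for bits in (8, 4):
    err = np.abs(w - quantize_dequantize(w, bits)).mean()
    print(f"{bits}-bit mean abs reconstruction error: {err:.4f}")
```

The 4-bit reconstruction error is roughly an order of magnitude larger than the 8-bit one, and in a real model that per-weight error accumulates across layers, which is one plausible mechanism behind the quality gap people report.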

You'd be better off using Qwen 3.6 Plus through the Alibaba coding plan.

  • > Quality of answers from quantized models is noticeable worse than using the full model.

    This is the very reason I've heard I shouldn't use Alibaba!