Comment by joseda-hg

5 months ago

It (CC) does have a /models command, so you can still decide to route everything to Opus if you just want to burn tokens. I guess it's not the default, so most wouldn't, but still, people willing to go to a third-party client are more likely to be that kind of power user anyway.

They still have the total consumption under their control (bar prompt caching and other specific optimizations), and in the past they even had different quotas per model. So it shouldn't cost them more money, it would just be a worse/different service, I guess.

4 comments

skeledrew 5 months ago

> it shouldn't cost them more money

As things are currently, better models mean bigger models that take more storage+RAM+CPU, or just spend more time processing a request. All of this translates to higher costs. Those costs may be mitigated by particular configs that kick in when the provider knows that a given client, one providing particular guarantees, is on the other side.

joseda-hg 5 months ago

That’s kind of the point. Even if users can choose which model to use (and apparently the default is the largest one), they could still say, for roughly the same cost: your Opus quota is X, your Haiku quota is Y, go ham. We’ll throttle you when you hit the limit.
skeledrew 5 months ago

But they don't want the subscription to be quota'd like that. The API does that automatically, though, as different models use different amounts of tokens when generating responses, and the billing is per token. That quite literally makes the user account for the actual costs of usage, which is the thing said users are trying to avoid on their own terms, and what they get upset about when they can't.

ac29 5 months ago

> It (CC) does have a /models command, you can still decide to route everything to Opus if you just want to burn tokens I guess it's not default so most wouldn't

Opus has been Claude Code's default model since fairly recently (around Opus 4.6?)