Comment by my002

19 hours ago

The era of subsidised inference is truly ending. The new model multipliers (https://docs.github.com/en/copilot/reference/copilot-billing...) seem like a huge leap, though. From 1x to 6x for new-ish GPT and Sonnet models. 27x for Opus...

Seems like folks would be better off with OpenRouter instead.

Lots of us have noticed that usage limits for Claude have been nerfed in recent weeks/months.

If anything, these new multipliers are more transparent than anything OpenAI or Anthropic have communicated regarding actual costs and give us a more realistic understanding of what it's costing these providers.

The fact that we were able to get such a substantial amount of usage for $20/$100/$200 a month was never meant to last, and to think otherwise was perhaps a bit naive.

This feels like a strategy from the ZIRP era of tech growth, where companies burned investor capital and gave away their products and services for free (or heavily subsidized) in order to prioritize user acquisition. Once they'd gained enough traction and stickiness, they'd implement a monetization strategy to capitalize on that user base.

  • However, inference costs for good-enough models are likely to keep declining. We're probably hitting diminishing returns on model size and training: the new generations aren't quantum leaps anymore, and newer generations of open source models like DeepSeek are getting good enough.

    There's going to be a limit to how much they can raise prices, because someone can always build out a datacenter and fill it up with open source DeepSeek inference and undercut your prices by 10x while still making a very good ROI--and that's a business model right there. Right now I'm sure there are a lot of people who will protest that they couldn't do their jobs with lesser models, but as time goes on that group will shrink. Already, consumers who are using AI for writing presentations, generating cooking recipes, and getting ELI5 answers to common questions aren't going to miss much with a lesser model. That tier will only get cheaper over time.

    Also, for business needs: as AI inference costs escalate, there comes a point where businesses rediscover human intelligence and start hiring/training people to do more of the work with lesser models--if that is more productive in the end than shelling out large amounts of cash for inference on the latest models. [Although given how much companies waste on AWS, there's a lot of tolerance for overspending in corporations...]

    • > because someone can always build out a datacenter and fill it up with open source DeepSeek inference and undercut your prices by 10x while still making a very good ROI-

      Not sure how it all works out. Currently, trillion-dollar companies can't even make native apps for each platform. Everything is just JS/Electron because the economics don't work for them.

      And here, companies can supposedly build GW-scale data centers running very expensive GPUs and sell inference at 1/10th of current prices? Sounds a little fanciful to me.

      1 reply →

    • I think so too.

      And at some point even frontier model costs will hopefully come down (if there is still a meaningful difference between closed and open source models at that point) as all of the compute that's being built out right now comes online.

  • Dunno. If, in this day and age, you are making inference more expensive and more scarce, you are honestly moving in the wrong direction, and DeepSeek and others will gladly take your lunch.

  • Did anyone really expect AI to be cheap?

    If/when it gets to the point where it can replace a skilled worker, the service can be sold for close to the same price as that skilled labour. But the AI can run 24/7, reliably, and scale up/down at a moment's notice.

    There's not going to be much competition to drive prices down, the barriers to entry are already huge. There's likely to be one clear winner, becoming a near-monopoly, or maybe we'll get a duopoly at best.

    • > Did anyone really expect AI to be cheap?

      Yes, a lot of people (not me). Why? Because that was the whole value proposition of these companies, relentlessly pushed by their PR and most of the media. Remember? It was something something pocket PhDs, massive unemployment, etc.

    • "There's not going to be much competition to drive prices down, the barriers to entry are already huge. There'll likely to be one clear winner, becoming a near-monopoly, or maybe we'll get a duopoly at best."

      Based on what exactly? So far every time OpenAI, Anthropic or whatever has released a new top performing model, competitors have caught up quickly. Open source models have greatly improved as well.

      I expect AI to be just like cloud computing in general - AWS, Azure, GCP being the main providers, with dozens of smaller competitors offering similar services as well.

      1 reply →

    • I do. "Commoditize your complement". Want to sell lots of silicon? Give away good local models to run on that silicon.

      Even if SOTA models in the cloud are a few percentage points better, most work can be routed to local models most of the time. That leaves the cloud providers fighting over the most computationally intensive tasks. In the long term, I think models are going to be local-first.

      (Unless providers can figure out a network effect that local models can't replicate).
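
      A minimal sketch of that local-first routing idea, in Python. Everything here is an illustrative assumption: the local endpoint, the cloud endpoint, the model names, and the crude "looks hard" heuristic are hypothetical stand-ins, not anyone's real setup.

        import requests

        LOCAL_URL = "http://localhost:11434/v1/chat/completions"   # hypothetical local runner
        CLOUD_URL = "https://api.example.com/v1/chat/completions"  # hypothetical cloud endpoint

        def looks_hard(prompt: str) -> bool:
            # Crude stand-in for a real difficulty classifier.
            return len(prompt) > 4000 or "refactor" in prompt.lower()

        def complete(prompt: str, cloud_key: str) -> str:
            # Route easy prompts to the local model; escalate heavy ones to the cloud.
            if looks_hard(prompt):
                url = CLOUD_URL
                headers = {"Authorization": f"Bearer {cloud_key}"}
                model = "frontier-model"  # hypothetical cloud model name
            else:
                url, headers, model = LOCAL_URL, {}, "local-small-model"  # hypothetical
            resp = requests.post(url, headers=headers, json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            }, timeout=120)
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]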

      9 replies →

    • > Did anyone really expect AI to be cheap?

      Considering most of the cost of producing a model is the upfront cost rather than the running one, I kinda still do.

      The point was never to produce 4 frontier models per company a year.

"This change aligns Copilot pricing with actual usage and is an important step toward a sustainable, reliable Copilot business and experience for all users."

I see statements like this as strong indicators that the salespeople are wrapping up their work and the accountants are taking over. The land rush is switching to an operational-efficiency play.

Yeah, totally. The recent pricing changes have made my Copilot subscription go from a great deal to awful value overnight.

I've been wanting to get off MS more generally and this is good motivation. Will be playing around with OpenRouter this week.

  • Just be aware OpenRouter charges a 5.5% fee; I didn’t know until recently. I like the product, and I think the fee is fair, but if you want the absolute best pricing then go direct.

    • But with OpenRouter you can always just use the latest model. If you're committed to, e.g., Claude Opus, then you're better off going directly to Anthropic for sure, but if not, various other models may be fine too, depending on use case, and massively cheaper. E.g. the new DeepSeek model with the same million-token context window, or Kimi K2.6 with a 270k context window for subagents.

      2 replies →

    • Or you could use GCP Vertex or AWS Bedrock and still have access to a bunch of FMs without a markup.

  • I will not be renewing/switching over, either.

    I had Copilot mainly so I could write issues and throw agents at it while I went off and did other things. Has been great for contained spot work.

    At this point, I'll go ahead and let it expire, and then consolidate between Codex and JetBrains AI. Especially since Xcode supports Codex with a first-party integration.

Even Sonnet 4.6 is at a 9x multiplier now (previously 1x)!

The only model I even used on Copilot was Sonnet, and now it's got a ridiculous multiplier.

At this point they might as well just charge per million tokens like every other provider instead of having a subscription.

  • They do for any new plan. Those multipliers are only for people who paid annually. After their subscription ends, they'll go onto token-based pricing like everyone else.

  • I understand it like this: the $10 covers the business record-keeping, maybe also the harness; I get a few coins to kick the tires, but to use it for anything real, it's pay-as-you-go at token list prices.

  • > At this point they might as well just charge per million tokens like every other provider instead of having a subscription.

    Pretty sure that's what they will eventually do.

27x for Opus is genuinely shocking. At that point you're not paying for convenience anymore, you're just paying a GitHub tax. OpenRouter or direct API makes way more sense unless you're really glued to the IDE integration.

  • I keep seeing people mention OpenRouter.

    Does it effectively bypass regional restrictions for you, so you can use something like the Claude API from unsupported regions such as Hong Kong, or does it still enforce the official providers' geo-restrictions?

    • OpenRouter is great for budget control, but as they are indirect APIs, your experience with cached tokens may vary, potentially costing much more than going direct, depending on the provider.

      You can pay with crypto though, which seems convenient for people under sanctions or with limited access, or if you are in a low-tax jurisdiction (e.g. HK).

      1 reply →

Wow, having a corp account, I do wonder WHEN we are getting some kind of usage restriction, or will be required to justify our usage.

That GPT4-mini change is going to be brutal! It's much better than 5-mini, which was itself much better than earlier free models.

The point of this loss leading is to properly hoover up the money in the pockets of enterprise customers, get them locked into the idea that they need the latest and greatest cloud-based model, while simultaneously starving everyone of the memory they'd need in order to run competent models locally.

In not-too-distant future we're going to be running better models on our phones than we can buy access to today in the cloud. Skate where the puck is going: soak the customers until that day comes.

It's interesting that the cost multiplier for Claude Sonnet 4/4.5/4.6 varies so much (1/6/9), while the API cost is exactly the same for all three models.

Also, the multiplier of 27 for Claude Opus 4.6/4. is way higher than the increase in API price would suggest.

I wonder why that is.

  • On GitHub Copilot you pay per prompt. More powerful models can do a lot more work (consuming a lot more tokens) per prompt. Also, they tend to use more thinking tokens.

    • > More powerful models can do a lot more work (consuming a lot more tokens) per prompt.

      That is not my experience. Each model since at least GPT-4 can fill up an entire context window. In fact, more powerful models can solve tasks faster, so their ratio of multiplier to API price should decrease, not increase.

      For example, Claude Sonnet 4.6 has a multiplier of 9 and an API price of $15, which is 0.6 multiplier per dollar.

      Claude Opus 4.7 has an API price of $25, so it should have a multiplier of 25 * 0.6 = 15 when extrapolating from Sonnet, but the multiplier is 27. (See the quick sketch below.)

      > Also, they tend to use more thinking tokens.

      That might be it. Is there any data on this somewhere?
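
      To pin down the arithmetic above, a quick sketch using only the prices quoted in this thread (illustrative, not official pricing data):

        # Multiplier-per-API-dollar for Sonnet 4.6, extrapolated to Opus 4.7.
        sonnet_multiplier = 9
        sonnet_api_price = 15.0    # as quoted above
        opus_api_price = 25.0      # as quoted above

        ratio = sonnet_multiplier / sonnet_api_price   # 0.6 multiplier per API dollar
        expected_opus_multiplier = opus_api_price * ratio
        print(expected_opus_multiplier)  # 15.0 -- versus the actual multiplier of 27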

Those multipliers are only for grandfathered Pro and Pro+ plans that had annual billing, basically a way to scare people out of those plans. Any new ones (and Business+Enterprise plans) will be on token-based billing starting June 1.

One theory of what xAI might do if everyone migrates to query-based billing:

Provide cheap and unlimited access to Grok for programmers (hence the Cursor partnership/purchase for distribution).

-> This would drive massive revenue right before the IPO announcement, as if the company were growing like crazy

-> At a loss, but don't worry, we need these funds to build the biggest datacenter in the universe.

This announcement would create enough momentum to increase the valuation, and because of the merger of his companies, would save his X/Twitter investors from a tragedy.

-> It would also be a great service to Cursor investors and co., who are stuck with their VSCode fork

Can't wait for people to migrate to open tools (opencode/openrouter). This will unlock a lot of innovation.

(I know openrouter is not open, but it allows competition and should be easily replaceable if needed)

Why would folks be better off paying a 5.5% fee to OpenRouter ("Open") if most people just use one or two providers? Just use the provider's API.

  • The routing automatically routes you to other inference providers (for the same model) if/when the original provider goes down.

    It's a convenience cost, for sure, but it's not valueless in a fast-moving world. Certainly if you're comfortable with one provider and it's cheaper, do that.

  • For me the largest value-add is the unified API. Being able to instantly start trialling a new model with zero code changes is well worth 5%. The other part is not having to deal with billing for multiple platforms.
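
    A minimal sketch of that zero-code-change swap, assuming OpenRouter's OpenAI-compatible endpoint; the model IDs below are illustrative, check openrouter.ai for current ones:

      from openai import OpenAI

      # Point the standard OpenAI client at OpenRouter's compatible endpoint.
      client = OpenAI(
          base_url="https://openrouter.ai/api/v1",
          api_key="sk-or-...",  # your OpenRouter key
      )

      # Trialling a different model is a one-string change.
      for model in ["anthropic/claude-sonnet-4.5", "deepseek/deepseek-chat"]:
          resp = client.chat.completions.create(
              model=model,
              messages=[{"role": "user", "content": "Say hello in one sentence."}],
          )
          print(model, "->", resp.choices[0].message.content)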

What's annoying is that it's so obvious. In the case of GPT 5.5: if Copilot is going to charge 7.5x what GPT 5.4 costs while OpenAI themselves, via the API/Codex, only charge 2x what GPT 5.4 costs, that will immediately raise an eyebrow.

  • To anybody who's been watching the tech sector with a critical eye for pretty much any period from the late 90s onward, this is just the enshittification process. For most of OpenAI's existence it's been obvious, to me, that investors were burning insane levels of capital to build the market, and now that folks are locked in, you're seeing higher fees, ads, etc. Yet again, the user is the product; the investors want to siphon your data and attention, and once you're hooked, your money. And for companies like Microsoft and Apple, those hooks can dig deep.

    • Let's call it what it is: dumping. Dumping things on the market below the cost of production. This should never have been allowed. R&D costs I can accept, somehow. But in this case the inference should always have been billed at the real cost it took to produce, paying off the capex.

    • “Enshittification” is just when unsustainable subsidies end?

      Another reason to hate that word.

      From a different perspective, you were granted an incredible gift from the companies who let you use their product on their dime. Hopefully you made the most of it when you had the opportunity.

      2 replies →

Everyone seems to believe OpenRouter isn't subsidizing, but until they publish audited financials, I personally doubt it.

  • OpenRouter doesn't even have hardware. What are they possibly subsidizing? The platform costs?

    OpenRouter is guaranteed to be about the highest margin operator in the business right now. Everyone wishes they'd be them, skimming 5% off as the middleman without any OpEx.

    • > OpenRouter is guaranteed to be about the highest margin operator in the business right now. Everyone wishes they'd be them, skimming 5% off as the middleman without any OpEx.

      The 5% fee probably has to factor in Stripe's fees, which would be around 3% to 4% depending on whether it's an international card.

    • Streaming, caching, and tool calling can get pretty expensive at scale, even when you don't touch inference. Maybe they're doing something clever and are quite profitable... or maybe they've already taken $40mm from VCs and are currently trying to raise $120mm at a $1.3B valuation.

      They also show headline prices for the cheapest provider of whatever model, but then need to hit different backends, some of which may be more expensive. For now they absorb those costs, but the VCs always come knocking.

      Just my opinion though. Totally agreed that they have one of the best positions amongst all AI providers from a financial standpoint.

      2 replies →

FYI, these are the multipliers for the annual plans. I would hazard a guess that most people are not on an annual plan.

  • I am, and I see it as stopping the music at a party when you want everyone to go home without telling them to go home. There is also the offer to quit with a prorated refund for the remaining time. I think I am going to take it.

I don't know if it's just me, but Copilot kind of sucks. I've been running local models with like 9B parameters and they are about as good, if not better. Obviously there are no integrations or whatever, and I get that most people are probably paying for that more than anything else, but eh. Big no thanks from me.

That's so unfair to us hard-working developers. A month ago I could buy a turn with Sonnet for $0.40. Now I have to pay at least $0.90 for that turn. Weeks ago I could buy an Opus turn for $0.12, after they had already raised prices, and now they want $0.27 from me for the same product! They are stealing from us!

  • They aren't stealing from us, for several reasons. First of all, it's a voluntary transaction. If you don't like the prices, use something else. Or don't use AI at all.

    Second, you have no idea what their costs are. Most likely they are simply passing their costs on to you. If that were not the setup, users would just go to another service provider offering tokens at a cheaper rate. It's not like there is a dearth of competitors in this business.

  • They already stole when they trained their models on the data.

    Now they just raise the price for us to buy it back.