Comment by y1n0

1 day ago

None of these companies have compute to spare. It’s not in their interest to use more tokens than necessary.

Sure it is. They're well aware their product is a money furnace and they'd have to charge users a few orders of magnitude more just to break even, which is obviously not an option. So all that's left is to convince users to burn tokens harder, so graphs go up, so they can bamboozle more investors into keeping the ship afloat for a bit longer.

  • If this claim is true (inference is priced below cost), it makes little sense that there are tens of small inference providers on OpenRouter. Where are they getting their investor money? Is the bubble that big?

    Incidentally, the hardware they run is known as well. The claim should be easy to check.

    • To be clear, I'm talking about subscription pricing. API pricing for Anthropic is probably at-cost.

      I dare you to run CC on API pricing and see how much your usage actually costs.

      (We did this internally at work, that's where my "few orders of magnitude" comment above comes from)

  • It's an option and they are going to do it. Chinese models will be banned and the labs will happily go dollar for dollar in plan price increases. $20 plans won't go away, but usage limits and model access will drive people to $40-$60-$80 plans.

    At cell phone plan adoption levels, and cell phone plan costs, the labs are looking at 5-10yr ROI.

Not true - they absolutely want to goose demand as they continue to burn investor dollars and deploy infra at scale.

If that demand even slows down in the slightest the whole bubble collapses.

Growth + Demand >> efficiency or $ spend at their current stage. Efficiency is a mature company/industry game.

That doesn’t mean they can’t also be wasteful. Fact is, Claude and GPT do far more internal thinking about their system prompts than is needed. At every step they mention something about making sure they do xyz and don't do whatever. Why does it need to say things to itself like “great I have a plan now!”? That’s pure waste.

  • > Why does it need to say things to itself like “great I have a plan now!”

    How else would it know whether it has a plan now?

Are you saying these companies don't want to sell more product to us? Because that's the logical extension of your argument.

  • No, the argument is they want to sell more product to more people, not just more product (to the same people). Given that a lot of their income is from flat-rate subscriptions, they make money with more people burning tokens rather than just burning more tokens.

    After all, "the first hit's free" model doesn't apply to repeat customers ;-)