Comment by root-parent

21 hours ago

The author seems strangely unwilling to distinguish usage from profitable product-market fit. And from his own numbers:

Anthropic Max: $100/month

OpenAI Pro: $100/month

Total paid: $200/month

API equivalent usage: $2,180.16 in 30 days

So he paid only 9.17% of the API-priced value, a 90.83% discount, or about $10.90 of API-priced usage for every $1 paid...
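For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the three figures from the numbers quoted above. The variable names are illustrative only; the inputs are the subscription prices and the API-equivalent total as stated in this comment.

```python
# Inputs as quoted in the comment (USD).
paid_per_month = 100 + 100   # Anthropic Max + OpenAI Pro
api_equivalent = 2180.16     # API-priced usage over the same 30 days

# Share of the API-priced value actually paid, and the implied discount.
share_paid = paid_per_month / api_equivalent
discount = 1 - share_paid

# API-priced usage received per dollar of subscription spend.
usage_per_dollar = api_equivalent / paid_per_month

print(f"share paid:        {share_paid:.2%}")
print(f"discount:          {discount:.2%}")
print(f"API usage per $1:  ${usage_per_dollar:.2f}")
```

The percentages depend only on the ratio of the two totals, so they say nothing about what serving those tokens actually cost the provider.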

That proves heavy usage but not sustainable unit economics.

Anthropic reported numbers point the same way:

Q2 revenue: $10.9B

Adjusted operating profit: $559M

Margin: 5.1%

SpaceX compute: $1.25B/month = $3.75B/quarter

So one compute supplier alone equals 34.4% of quarterly revenue and 6.7x quarterly adjusted operating profit.
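The ratios above can be reproduced with a short sketch. It takes the figures exactly as quoted in this comment and assumes, as the comment does, that a quarter is three months of the monthly compute figure; the variable names are illustrative.

```python
# Figures as quoted in the comment (USD).
q2_revenue = 10.9e9
adjusted_operating_profit = 559e6
compute_per_month = 1.25e9

# Assumption carried over from the comment: quarter = 3 x monthly figure.
compute_per_quarter = 3 * compute_per_month

margin = adjusted_operating_profit / q2_revenue
compute_share_of_revenue = compute_per_quarter / q2_revenue
compute_vs_profit = compute_per_quarter / adjusted_operating_profit

print(f"adjusted operating margin: {margin:.1%}")
print(f"compute / revenue:         {compute_share_of_revenue:.1%}")
print(f"compute / profit:          {compute_vs_profit:.1f}x")
```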

It's difficult for the blogger to understand something when his incentives depend on not understanding it...

3 comments

simonw 21 hours ago

My point with the $2,180.16 thing is that the price for consumers like myself is heavily discounted... but the price for enterprise companies is not discounted.

My usage is therefore a useful indicator of quite how much those enterprise companies may be spending on tokens, given the new pricing scheme.

If enterprise companies were still getting the same discounts that I get myself I would not have written this article.

(I had to dig into your margin figure. It looks like you calculated 5.1% as 559000000 / 10900000000 * 100, but that $559M "adjusted operating profit" figure already accounts for training costs. Usually when we talk about margin on inference we're not including those, since those costs are fixed; margin calculations make more sense against the variable costs of serving a token.)

what 19 hours ago
When you have to train a new model every few months to stay competitive, discounting that cost is rather dubious.
- simonw 19 hours ago
  
  The key difference here is that training costs are fixed. If you train a model for $100m, how much of that training fee should you allocate to each token that the model serves?
  It's impossible to know, because you don't know how many tokens total will be served by that model until you retire it at some point in the future.
  So you can't say "1,000,000 tokens cost $X in inference and $Y in training" because $Y is not possible to correctly calculate.
  So, if you want to have a productive conversation about "margin on inference", it's sensible to look at the cost of serving the tokens independently of the cost of training the underlying model.
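To make the allocation problem concrete, here is a rough sketch of how the training cost per million tokens moves with lifetime token volume. The $100m training cost is the figure used above; the lifetime volumes are purely hypothetical placeholders, and the function name is made up for illustration.

```python
def training_cost_per_million_tokens(training_cost_usd, lifetime_tokens):
    """Spread a fixed training cost evenly over every token a model ever serves."""
    return training_cost_usd * 1_000_000 / lifetime_tokens

TRAINING_COST_USD = 100_000_000  # the $100m example from the comment

# Hypothetical lifetime volumes: nobody knows the real one until the model is retired.
hypothetical_lifetime_tokens = [10**12, 10**13, 10**14]

for tokens in hypothetical_lifetime_tokens:
    y = training_cost_per_million_tokens(TRAINING_COST_USD, tokens)
    print(f"{tokens:.0e} lifetime tokens -> ${y:,.2f} of training cost per 1M tokens")
```

Under these made-up volumes the allocated training cost per million tokens ranges from $100 down to $1, which is why $Y cannot be pinned down before the model is retired.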