Comment by ethin
21 hours ago
> If you just do the math yourself, it's easy to compute that inference doesn't cost all that much.
Show us your work, then. If it's so easy to do, this should be a trivial request to accommodate, no?
Just look at large open weights models being served by inference providers.
Kimi 2.6 is a 1 trillion total / 32B active parameter model that's roughly comparable to Sonnet. Sonnet's API pricing is $5 in, $15 out per million tokens. DeepInfra serves Kimi at $0.75 in, $3.50 out, and OpenRouter lists about the same. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.
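For anyone who wants the arithmetic behind the 4-7x figure, here is a quick sketch. It takes the per-million-token prices quoted above at face value (they are not independently verified here), and the variable and function names are made up for illustration.

```python
# Per-million-token prices in USD, exactly as quoted in the comment above.
# These are the commenter's figures, not verified list prices.
sonnet = {"input": 5.00, "output": 15.00}
kimi_deepinfra = {"input": 0.75, "output": 3.50}


def price_multiples(premium, baseline):
    """Return how many times more the premium price is, per token type."""
    return {kind: premium[kind] / baseline[kind] for kind in premium}


multiples = price_multiples(sonnet, kimi_deepinfra)
for kind, ratio in multiples.items():
    print(f"{kind}: {ratio:.2f}x")
```

With those inputs the multiple comes out to about 6.67x on input tokens and about 4.29x on output tokens, which is where the 4-7x range comes from. Note that this compares prices only; it says nothing about either provider's actual serving cost.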
I'm not sure just how good that looks for Anthropic/OpenAI.
4-7x isn't a tiny markup, but how does that compare to high-margin internet businesses like AdSense? Meta and Google do hundreds of billions in ad revenue a year, and after taking out the publisher's portion (60-80% per some searching), I wonder what the ratio of the remaining tens-of-billions is against the compute cost and headcount required to run it.
And how much room for maintaining or improving that margin do they have if the cheap competitors also continue getting better? Is there a "good enough" point where the easier inference tasks are all moving to vendors massively undercutting them, and then they don't have the volume necessary to justify spending on further cutting-edge development?
> Kimi 2.6 is a 1 trillion total / 32B active parameter model that's roughly comparable to Sonnet.
No, it's not. On some rigged benchmark paper, maybe. Some of those benchmarks show all the models clustered together, which they clearly are not.
> Sonnet's API pricing is $5 in, $15 out per million tokens. DeepInfra serves Kimi at $0.75 in, $3.50 out, and OpenRouter lists about the same. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.
That's not saying much. You can get "cloud" at AWS or you can get a VPS, and there is likely a 10x price difference between them. They're not the same product. And while AWS costs more, it doesn't have 7x margins either.