Comment by twobitshifter
15 hours ago
We have GPU costs, power costs, and how many tokens/s models can generate on those GPUs. It’s possible to figure out the marginal cost based on this. The current estimate is about $0.40 per million tokens for a GPT-4-equivalent model. Sonnet 4 is $15 per million tokens, so they are charging high margins on inference. The issue is how large a margin is needed to recover their costs before the GPUs age out, and how high a margin can be charged before it’s not economically viable.
That seems way off to me.
I skimmed the article, but couldn’t spot any details on their estimates. They mention 70B+ params as being large in several places. But we’ve had several 100B+ param models that trail Sonnet.
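For anyone who wants to sanity-check the quoted figure, here is a rough sketch of the arithmetic it describes: GPU cost plus power cost, divided by throughput. The function names and every input are my own placeholders, not numbers from the article, and the result swings a lot depending on what tokens/s you assume for a Sonnet-class model.

```python
def cost_per_million_tokens(gpu_usd_per_hour, power_kw, usd_per_kwh, tokens_per_second):
    """Marginal serving cost in USD per million generated tokens.

    All four inputs are assumptions the caller has to supply; nothing here
    comes from the article or from any vendor's published numbers.
    """
    # Hourly cost of running one GPU: amortized hardware (or rental) plus electricity.
    hourly_cost = gpu_usd_per_hour + power_kw * usd_per_kwh
    # Tokens that GPU produces in one hour at the assumed throughput.
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


def markup_multiple(price_per_million, cost_per_million):
    """How many times the marginal cost the list price is."""
    return price_per_million / cost_per_million
```

Plugging in the two numbers from the quote, `markup_multiple(15, 0.40)` gives 37.5x. Halve the assumed tokens/s, or split the model across several GPUs, and the cost side moves proportionally, which is why the missing details on their estimates matter.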