Comment by ashdksnndck

1 hour ago

The narrative is that inference on existing models is profitable. All of those profits, plus many billions of additional invested capital, go into training the next model, which costs some multiple more to train than the last. Each new model generation also brings more revenue growth, mainly because of higher capabilities. Newer models are more compute-efficient when distilled (so they could possibly be higher margin), but they also work on longer time-horizon tasks and can make greater use of test-time compute, which increases token counts. So the inference ROI on each model can pay back the cost of training it, but the demand for future growth puts all that money and more into training the next model. The numbers we’d need to prove whether this is true are not public, but it makes sense and fits what info we do have.
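To make the arithmetic concrete, here is a toy sketch in Python. Every number in it is invented for illustration, and the names (`period_cash_flows`, `cost_multiple`, `roi`) are hypothetical, not anything the labs publish. It assumes each model earns back a fixed multiple of its training cost over its serving lifetime, and that this income arrives while the next, more expensive model is being trained.

```python
# Toy model only: all inputs are made up, not real financials.

def period_cash_flows(first_cost, cost_multiple, roi, generations):
    """Return (training costs, net cash flow per period).

    first_cost:    training cost of the first model (arbitrary units)
    cost_multiple: how many times more each model costs than the previous one
    roi:           lifetime inference profit per unit of training cost
    """
    costs = [first_cost * cost_multiple ** g for g in range(generations)]
    flows = []
    for g, cost in enumerate(costs):
        # While model g trains, the previous model (if any) is earning.
        income = roi * costs[g - 1] if g > 0 else 0.0
        flows.append(income - cost)
    return costs, flows


# Assumed inputs: each model returns 2x its training cost,
# but the next model costs 3x as much to train.
costs, flows = period_cash_flows(
    first_cost=1.0, cost_multiple=3.0, roi=2.0, generations=4
)

# Cash flow in the period after training stops: income with no new training bill.
harvest = 2.0 * costs[-1]

print("training costs:", costs)
print("net cash flow per period:", flows)
print("cash flow if training stops:", harvest)
```

With these made-up inputs every model pays back twice its own training cost, yet net cash flow is negative in every period because the next training run always costs more than the last model earns. The sign only flips once the lab stops training a bigger successor.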

Theoretically, if training more expensive models stops producing better capabilities or stops being economically viable, the labs can shift gears into making a profit on old models. A lot of future growth is priced in, so this would lead to a collapse in share price if it happens anytime soon.

There’s a story out that Anthropic might be profitable this quarter. This is in one sense bad news - it means that the company wasn’t aggressive enough about acquiring capacity last year, because it didn’t foresee how fast its inference business would grow. Anthropic is now forced to make suboptimal choices about serving existing users vs. training the next model (it has to scrounge for capacity by paying other players like SpaceX). And as a Claude Code user I feel like I’ve been affected by that, what with the random outages and performance degradations.