Comment by atq2119
13 hours ago
It's been a while since I saw a detailed paper on a high-end training run, but extrapolating from what I remember, those training runs seem to be in the tens of trillions of tokens. That figure already accounts for tokens potentially being sampled more than once during the training run.
That seems like a large number until you realize that OpenAI claims to have almost a billion weekly users, and that OpenRouter shows many models at over a trillion tokens per week.
So in pure token terms, I'd say it is in fact extremely plausible that inference dominates, at least for the popular models.
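A rough back-of-the-envelope sketch of that comparison, in pure token terms only. The 20 trillion training tokens and the 10,000 tokens per user per week are made-up illustrative assumptions, not reported figures; the one trillion tokens per week and the one billion weekly users are the round numbers mentioned above.

```python
def weeks_to_match_training(training_tokens: float,
                            inference_tokens_per_week: float) -> float:
    """Weeks of serving until cumulative inference tokens equal training tokens."""
    if inference_tokens_per_week <= 0:
        raise ValueError("inference_tokens_per_week must be positive")
    return training_tokens / inference_tokens_per_week


# ASSUMPTION: "tens of trillions" taken as 20 trillion for illustration.
TRAINING_TOKENS = 20e12

# Round figure from the comment: a popular model at ~1 trillion tokens per week.
SINGLE_MODEL_WEEKLY_TOKENS = 1e12

# Round figure from the comment: ~1 billion weekly users.
WEEKLY_USERS = 1e9
# ASSUMPTION (hypothetical): average tokens processed per user per week.
TOKENS_PER_USER_PER_WEEK = 10_000

# One popular model at a trillion tokens per week.
print(weeks_to_match_training(TRAINING_TOKENS, SINGLE_MODEL_WEEKLY_TOKENS))  # 20.0

# A billion users at the assumed per-user volume.
print(weeks_to_match_training(TRAINING_TOKENS,
                              WEEKLY_USERS * TOKENS_PER_USER_PER_WEEK))  # 2.0
```

Under these assumptions, inference token volume passes the training token count within weeks to a few months, so a model served for a year would see far more tokens at inference than in training. Changing either assumed constant shifts the crossover point proportionally.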