Comment by intrasight

1 day ago

Models will get smarter and cheaper. For those that are burned directly into silicon, there will be a market for old models - as the alternative is to dump that silicon in a landfill.

For models that run on general-purpose AI hardware, I don't know why the vendors would waste that resource on old models.

Larger models need more hardware resources to run.

And, depending on effort settings, they do more 'thinking', i.e., use more rounds of inference to generate longer internal chains of thought.

Both are very good reasons to prefer a smaller model, if the smaller model is good enough for the task.
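As a rough sketch of how those two effects compound, here is a toy cost model. All of the numbers (token counts, thinking multipliers, prices) are hypothetical assumptions for illustration, not real vendor figures:

```python
# Illustrative sketch only: every number below is a made-up assumption.
def job_cost(output_tokens: int, thinking_multiplier: float,
             usd_per_1m_tokens: float) -> float:
    """Cost of one request when reasoning effort inflates tokens generated."""
    return output_tokens * thinking_multiplier / 1_000_000 * usd_per_1m_tokens

# A hypothetical small model: cheap tokens, light thinking overhead.
small = job_cost(2_000, thinking_multiplier=2, usd_per_1m_tokens=0.50)
# A hypothetical large model at high effort: pricey tokens, heavy thinking.
large = job_cost(2_000, thinking_multiplier=8, usd_per_1m_tokens=20.00)

print(f"small ${small:.4f}, large ${large:.4f}, ratio {large / small:.0f}x")
```

The per-token price gap and the extra thinking tokens multiply, so the end-to-end cost gap can be much larger than the headline price ratio alone.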

Who says anything about old models? What we're seeing is that as frontier models get better, we get cheaper, better small models that leverage those advances but cost a fraction as much. At the same time, hardware provides more, cheaper options, and sometimes far faster ones too (e.g., Cerebras).

In terms of price, I can get 1M output tokens from DeepSeek for 40 cents vs. $25 for Opus, and there are a number of models near the $1-2 mark that are increasingly viable for a growing set of applications.
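Taking the two prices quoted above (assumed to be per 1M output tokens), the gap is easy to quantify:

```python
# Prices per 1M output tokens, as quoted in the comment above.
deepseek_usd = 0.40
opus_usd = 25.00

ratio = opus_usd / deepseek_usd
print(f"Opus costs {ratio:.1f}x more per output token")
```

A 60x-plus per-token price difference is what makes the cheaper tier "good enough" economics for so many workloads.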

Providers will keep running those cheaper models as long as there's demand.