Comment by Voloskaya
13 hours ago
People said the same thing for deepseek-r1, and nothing changed.
If you come up with a way to make the current generation of models 10x more efficient, everyone just moves on to training a 10x bigger model. There isn’t a model size at which the players will be satisfied and stop going 10x bigger. Not as long as scaling still pays off (and it does today).