Comment by elif
1 day ago
It's the same thing. Quantize your parameters? The "bigger" model runs faster. MoE base-model distillation? The "bigger" model runs as a smaller model.
There is no gain for anyone anywhere in reducing the overall parameter count, if that's what you mean. That sounds more like you don't like transformer models than like a real performance need.
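For what it's worth, here's a rough sketch of the quantization case using PyTorch's dynamic quantization (the toy model and sizes are made up for illustration): the parameter count doesn't change, but the weights are stored as int8 instead of float32, so the same "big" model takes roughly 4x less memory and runs faster on CPU.

    import torch
    import torch.nn as nn

    # Toy "big" model; quantization won't change its parameter count.
    model = nn.Sequential(
        nn.Linear(4096, 4096),
        nn.ReLU(),
        nn.Linear(4096, 4096),
    )

    # Dynamic int8 quantization: weights stored in 8 bits instead of 32,
    # so the same number of parameters takes less memory and runs faster.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 4096)
    print(quantized(x).shape)  # same output shape, same parameter count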