Comment by zaptrem
2 months ago
"You can train a SOTA LLM for $0.50" (as long as you're distilling a model that cost $500m into another pretrained model that cost $5m)
The original statement stands even if what you're adding to it is true. If a one-time investment of $505m is enough to distill new SOTA models for $0.50 apiece, then the average cost per model trends toward $0.50 as more of them get distilled.
That's absolutely fantastic, because if you have one good idea that's additive to the SOTA, you can test it for a dollar, not millions.
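To put rough numbers on the amortization argument, here is a minimal sketch. It takes the $505m fixed cost and $0.50 per-distillation cost straight from the thread; the function name and the model counts are just illustrative assumptions.

```python
def average_cost_per_model(fixed_cost: float, marginal_cost: float, n_models: int) -> float:
    """Average cost per model once the one-time fixed cost is spread over n_models."""
    if n_models < 1:
        raise ValueError("n_models must be at least 1")
    return (fixed_cost + marginal_cost * n_models) / n_models


FIXED_COST = 505_000_000.0  # $500m teacher + $5m pretrained student (figures from the thread)
MARGINAL_COST = 0.50        # claimed cost of one distillation run

# Hypothetical model counts, chosen only to show the trend.
for n in (1, 1_000, 1_000_000, 1_000_000_000):
    avg = average_cost_per_model(FIXED_COST, MARGINAL_COST, n)
    print(f"{n:>13,} models: average ${avg:,.2f} each")
```

Worth noting that the average only gets close to $0.50 at very large counts (at a million models it is still $505.50 each). The marginal cost of trying one more idea is $0.50 from the start, though, which is the part that matters for cheap experiments.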