Comment by kccqzy
5 days ago
They are probably hoping that someone else will distill it into smaller models, much as DeepSeek released a giant 671B model and there are now useful distillations down to 30B.