Comment by swyx
1 year ago
why are you comparing Claude 3, with models at ~14B and >200B, to Gemma, a 2-7B model? of course Gemma is going to do worse. the question for smol models is whether they can do well enough given a performance budget.