Slacker News

Comment by ed

8 months ago

Interesting direction for research, but not a model you’d want to use today. The paper looks at a 3B model built on Llama-3.2-3B and modified to use Mamba, and it compares that model against a distilled version of R1 with only 1.5B params.
