Comment by 1dom
21 days ago
Yup, confirming what pamcake said, 30b with 3b active.
I have a laptop with a broken screen and an RTX 2060 at my disposal. I can just about run 12b - 14b dense models usably, although I think 4b - 8b dense models give me the best tradeoff of speed and usefulness.
Larger MoE models with more total parameters (20b+) but fewer active ones (2 - 3b) are sometimes a little bit slower, but are often far more capable.
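A rough back-of-the-envelope sketch of that tradeoff, not a benchmark. It assumes 4-bit quantized weights and the common rule of thumb of about 2 FLOPs per active parameter per token; the function names and the model list are made up for illustration, and it ignores KV cache, activations, and runtime overhead.

```python
def weight_gb(total_params_b: float, bits_per_param: float = 4.0) -> float:
    """Approximate weight size in GB (10^9 bytes).

    total_params_b is the total parameter count in billions.
    bits_per_param=4.0 is an assumption (4-bit quantization).
    """
    return total_params_b * bits_per_param / 8.0


def gflops_per_token(active_params_b: float) -> float:
    """Rough forward-pass cost per token in GFLOPs.

    Uses the rule of thumb of ~2 FLOPs per active parameter.
    """
    return 2.0 * active_params_b


# (total params in billions, active params in billions); illustrative only
models = {
    "8b dense": (8.0, 8.0),
    "14b dense": (14.0, 14.0),
    "30b MoE, 3b active": (30.0, 3.0),
}

for name, (total_b, active_b) in models.items():
    print(
        f"{name}: ~{weight_gb(total_b):.1f} GB weights, "
        f"~{gflops_per_token(active_b):.0f} GFLOPs/token"
    )
```

Under these assumptions the MoE needs the most memory for its weights but the least compute per token. If the weights do not all fit in VRAM, part of the model has to sit in system RAM, which may be part of why it can still end up a little slower in practice.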