Comment by dofm
6 hours ago
So far, the smallest model I have actually seen behave in a way that feels consistent with the contemporary LLM chat experience is Gemma 4 12B (the QAT build in particular). The E4B model is not bad: it has good conversational flow and responds well if nudged. But the 12B model is the one that feels capable.
Nothing below that size really seems to be good for anything other than training for specific tasks. I have not been impressed by the earlier Apertus 8B model, which doesn't seem to respond to nudges.
I am a strong believer in smaller models, so I might try one of these out of curiosity to see if it might do useful things in limited contexts.