Comment by dofm

6 hours ago

So far, the smallest model I have actually seen behave in a way that feels consistent with the contemporary LLM chat experience is Gemma 4 12B (the QAT build in particular). The E4B model is not bad: it has a good conversational flow and responds well if nudged. But the 12B model feels capable.

Nothing below that really seems to be good for anything other than training for specific tasks. I have not been impressed by the earlier Apertus 8B model, which doesn't feel like it really responds to nudges.

I am a strong believer in smaller models, so I might try one of these out of curiosity to see if it might do useful things in limited contexts.