Comment by adrian_b
16 hours ago
I do not know what you mean by sparse models.
All 4 gemma-4-*-it models, regardless of whether they are dense or MoE models, have an associated small model for MTP, whose name is obtained by appending the "-assistant" suffix to the model name.
https://huggingface.co/google/gemma-4-E2B-it-assistant
https://huggingface.co/google/gemma-4-E4B-it-assistant
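To illustrate the naming rule, here is a minimal sketch that derives the assistant repository URL from a base model id. The helper names `assistant_repo_id` and `assistant_url` are hypothetical, and the only model ids used are the two whose links appear above; it only builds strings and does not check that a repository exists.

```python
# Hypothetical helpers illustrating the "-assistant" naming convention.
HF_BASE_URL = "https://huggingface.co"


def assistant_repo_id(base_repo_id: str) -> str:
    """Return the id of the small MTP model by appending "-assistant"."""
    return f"{base_repo_id}-assistant"


def assistant_url(base_repo_id: str) -> str:
    """Build the Hugging Face URL of the assistant model (string only, no network access)."""
    return f"{HF_BASE_URL}/{assistant_repo_id(base_repo_id)}"


if __name__ == "__main__":
    # Base ids inferred from the two links in the comment above.
    for base in ("google/gemma-4-E2B-it", "google/gemma-4-E4B-it"):
        print(assistant_url(base))
```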