Comment by Centigonal

2 days ago

> MAI-Thinking-1 is a 35B-active, ~1T-total parameters, sparse Mixture of Experts model, a smaller inference footprint than much larger models.

This seemingly nonsensical sentence (of course it will have a smaller inference footprint than larger models) suggests that this model's competitors have larger inference footprints and larger total parameter counts.

3 comments

dr_kiszonka 2 days ago

When would a larger model have a smaller inference footprint? If the larger was MoE and the smaller was dense?

Centigonal 1 day ago
Yes, MoE reduces the inference compute requirements (the inference memory requirements remain the same).
- rajveerb 19 hours ago
  
  As someone who has spent quite a lot of time on inference, I would add a small note:
  Deployment looks very different for MoE than for dense-style models, so I would say it is more nuanced than "inference memory reqs remain the same". Memory can be very different for MoE-style models.
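
To put rough numbers on the compute-vs-memory point in this thread, here is a back-of-the-envelope sketch. The 35B active and ~1T total figures come from the quoted sentence; the 2-FLOPs-per-active-parameter rule of thumb and the 16-bit weight size are my assumptions, and the function names are made up for illustration. It ignores KV cache, attention cost, routing overhead, and how experts are sharded across devices, which is where the memory nuance mentioned above comes in.

```python
# Back-of-the-envelope sketch: active vs. total parameters in a sparse MoE.
# Parameter counts are taken from the quoted sentence; everything else is an assumption.

ACTIVE_PARAMS = 35e9   # parameters used per token (from the quote)
TOTAL_PARAMS = 1e12    # approximate total parameters (from the quote)
BYTES_PER_PARAM = 2    # assumption: weights stored in a 16-bit format


def flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params


def weight_memory_gb(total_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Memory needed just to hold the weights, in GB (10^9 bytes)."""
    return total_params * bytes_per_param / 1e9


moe_flops = flops_per_token(ACTIVE_PARAMS)
# Hypothetical dense model with the same total parameter count, for comparison.
dense_flops = flops_per_token(TOTAL_PARAMS)

print(f"MoE compute per token:   {moe_flops:.2e} FLOPs")
print(f"Dense compute per token: {dense_flops:.2e} FLOPs")
print(f"Compute ratio (dense / MoE): {dense_flops / moe_flops:.1f}x")
print(f"Weight memory either way: {weight_memory_gb(TOTAL_PARAMS):.0f} GB")
```

Under these assumptions the per-token compute scales with the active parameters, while all of the weights still have to live somewhere, so the weight memory tracks the total count. In a real deployment the memory picture also depends on batch size, context length, and expert placement, so treat this as a first-order estimate only.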