Comment by zozbot234
6 days ago
Using mmap for model parameters lets you run vastly larger models for any given amount of system RAM, because weights are paged in from the file on demand instead of all being loaded up front. It's especially worthwhile when you're running MoE models, where parameters for unused "experts" can just be evicted from RAM, leaving room for more relevant data. But of course the same applies more generally at other granularities, e.g. individual model layers.
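To make that concrete, here's a rough sketch using Python's standard `mmap` module. The flat file layout, the sizes, and the names (`write_fake_weights`, `expert_weights`, `demo`) are all made up for illustration and don't correspond to any real model format or inference engine. The point is only that mapping the file reserves address space, and physical pages get pulled in when an expert's byte range is actually read.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: NUM_EXPERTS equally sized blocks of float32 weights.
NUM_EXPERTS = 8
FLOATS_PER_EXPERT = 1024
EXPERT_BYTES = FLOATS_PER_EXPERT * struct.calcsize("f")


def write_fake_weights(path):
    """Write a toy weight file: every float in expert i's block equals float(i)."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            block = [float(i)] * FLOATS_PER_EXPERT
            f.write(struct.pack(f"{FLOATS_PER_EXPERT}f", *block))


def expert_weights(mm, expert_id):
    """Read one expert's block from the mapping.

    Only the pages covering this byte range need to be resident; the other
    experts' pages are never touched by this call.
    """
    offset = expert_id * EXPERT_BYTES
    return struct.unpack_from(f"{FLOATS_PER_EXPERT}f", mm, offset)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.bin")
        write_fake_weights(path)
        with open(path, "rb") as f:
            # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                total_bytes = len(mm)
                weights = expert_weights(mm, 5)
        return total_bytes, weights[0], len(weights)


if __name__ == "__main__":
    print(demo())
```

Since the mapping is read-only and file-backed, the OS is free to drop those pages under memory pressure and re-read them from disk later, which is what makes the "evict unused experts" part work without the application managing it. A real engine would read tensors through zero-copy views rather than `struct.unpack_from`, which copies the values out.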