Comment by imtringued
3 days ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
3 days ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
No comments yet
Contribute on Hacker News ↗