Comment by imtringued
3 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
3 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
No comments yet
Contribute on Hacker News ↗