Comment by imtringued
6 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
6 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
No comments yet
Contribute on Hacker News ↗