Comment by imtringued
10 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
10 months ago
I'm not an expert, but MoE models perform better at continuous learning, because they are less prone to catastrophic forgetting.
No comments yet
Contribute on Hacker News ↗