Comment by MattRix
6 days ago
I don’t see why this would happen when modern models already use MoE, which gives them most of the benefits of having specialized models.