Comment by bee_rider
13 hours ago
That blog post was super interesting. It is neat that he can select experts and control the routing in the model. Not having played with the models in detail, I tended to assume the “mixing” in mixture of experts was more like a blender, haha. The models are still quite lumpy I guess!