Comment by boroboro4
3 months ago
The chosen expert (on each layer) depends on the output of the previous layer. Not sure how you can preload the experts before the forward pass.
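To make the dependency concrete, here is a toy sketch of top-1 routing. It is not any real model's code: `ToyMoELayer`, `make_layer`, `chosen_experts`, and the hand-picked 2x2 weights are all made up for illustration, and every layer shares the same weights to keep it short. The only point is that each layer's router reads the hidden state produced by the layer before it, so the expert index is not known until that layer has run.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ToyMoELayer:
    """Hypothetical top-1 mixture-of-experts layer (illustration only)."""

    def __init__(self, router, experts):
        self.router = router    # one row of router weights per expert
        self.experts = experts  # one weight matrix per expert

    def route(self, h):
        # The router scores every expert from the current hidden state h
        # and picks the highest-scoring one.
        logits = matvec(self.router, h)
        return max(range(len(logits)), key=lambda i: logits[i])

    def forward(self, h):
        idx = self.route(h)
        return idx, matvec(self.experts[idx], h)


def make_layer():
    # Assumed toy weights: the router picks the expert matching the larger
    # coordinate; expert 0 scales the second coordinate, expert 1 the first.
    router = [[1.0, 0.0], [0.0, 1.0]]
    experts = [
        [[1.0, 0.0], [0.0, 3.0]],
        [[3.0, 0.0], [0.0, 1.0]],
    ]
    return ToyMoELayer(router, experts)


def chosen_experts(layers, x):
    """Run the stack and record which expert each layer selected."""
    h = list(x)
    chosen = []
    for layer in layers:
        # idx for this layer can only be computed once h, the previous
        # layer's output, is available.
        idx, h = layer.forward(h)
        chosen.append(idx)
    return chosen


layers = [make_layer() for _ in range(3)]
print(chosen_experts(layers, [2.0, 1.0]))   # [0, 1, 0]
print(chosen_experts(layers, [10.0, 1.0]))  # [0, 0, 0]
```

Both inputs pick expert 0 on the first layer but diverge on the second, so even knowing the first layer's choice does not tell you which expert to load next.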