Comment by CalmStorm
5 days ago
For the first time, it introduced native sparse attention into the full training process, achieving up to 11× inference speedup while maintaining model performance.
5 days ago
For the first time, it introduced native sparse attention into the full training process, achieving up to 11× inference speedup while maintaining model performance.
No comments yet
Contribute on Hacker News ↗