Comment by samagra14
23 days ago
Sounds interesting! I love all these edge experiments. But as long as models ship with architecture-dependent code, I feel these edge experiments can't fully play to their strengths.
You try to run something and voilà, you need Ampere or Hopper or Laplace for flash attention.