Comment by turnsout
1 month ago
It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!