Comment by turnsout
1 day ago
It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!