Comment by turnsout
18 days ago
It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!