Comment by Saris
2 hours ago
You can try tweaking the MoE offload setting. I found the sweet spot after a few tries, and changing it by even 1 can cut speed by a few tok/s. I think I average around 45 tok/s, but sometimes it'll hit 50.
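If you'd rather not hunt for the sweet spot by hand, here's a rough sketch of the same trial-and-error as a sweep. It's only an illustration: `find_best_setting` and `fake_measure` are made-up names, and `measure` stands in for whatever you use to run your own setup at a given offload value and report tok/s. Nothing here calls a real tool.

```python
from statistics import mean
from typing import Callable, Iterable


def find_best_setting(
    measure: Callable[[int], float],
    candidates: Iterable[int],
    trials: int = 3,
) -> tuple[int, float]:
    """Try each candidate offload value and return (value, mean tok/s) for the fastest.

    `measure` is a hypothetical callback: it should run a short generation
    with the given offload value and return the observed tokens per second.
    """
    results: dict[int, float] = {}
    for value in candidates:
        # Average several runs, since speed varies from run to run.
        samples = [measure(value) for _ in range(trials)]
        results[value] = mean(samples)
    best = max(results, key=lambda v: results[v])
    return best, results[best]


def fake_measure(offload: int) -> float:
    """Stand-in benchmark (invented numbers): peaks at 20, drops 3 tok/s per step away."""
    return 45.0 - 3.0 * abs(offload - 20)


if __name__ == "__main__":
    print(find_best_setting(fake_measure, range(16, 25)))
```

Step through the values one at a time, since a change of 1 can already matter, and keep a few trials per value so a lucky run doesn't pick the winner.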