Comment by sosodev
4 hours ago
Around 20ish tokens a second with 6-bit quant at very long context lengths on my AMD AI Max 395+
I’m trying to use local models whenever possible. Still need to lean on the frontier models sometimes.
4 hours ago
Around 20ish tokens a second with 6-bit quant at very long context lengths on my AMD AI Max 395+
I’m trying to use local models whenever possible. Still need to lean on the frontier models sometimes.
No comments yet
Contribute on Hacker News ↗