Comment by hadlock
10 months ago
deepseek-r1:8b screams on my 12 GB GPU. gemma3:12b-it-qat runs just fine, a little faster than I can read. Once a model exceeds GPU RAM, a lot of it gets offloaded to the CPU, and splitting between GPU and CPU is dramatically (80%? 95%?) slower.
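A rough back-of-the-envelope way to guess whether a model will stay entirely on the GPU, as a sketch only. The function names, the ~4.5 bits per weight figure, and the 1.5 GB allowance for KV cache and runtime buffers are all assumptions for illustration, not measured values; real usage depends on the quantization, context length, and runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight=4.5, overhead_gb=1.5):
    """Rough VRAM estimate: quantized weight size plus a fixed allowance.

    bits_per_weight and overhead_gb are illustrative assumptions,
    not values reported by any particular runtime.
    """
    weights_gb = params_billion * bits_per_weight / 8  # 1e9 params * bits / 8 -> GB
    return weights_gb + overhead_gb


def fits_on_gpu(params_billion, vram_gb, **kwargs):
    """True if the estimate is within the card's memory, i.e. no CPU offload expected."""
    return estimate_vram_gb(params_billion, **kwargs) <= vram_gb


if __name__ == "__main__":
    # Hypothetical sizes in billions of parameters, checked against a 12 GB card.
    for size in (8, 12, 27):
        need = estimate_vram_gb(size)
        print(f"{size}B: ~{need:.1f} GB needed, fits in 12 GB: {fits_on_gpu(size, 12)}")
```

Under those assumptions the 8B and 12B models stay under 12 GB, while something around 27B would spill over and hit the slow GPU/CPU split.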