Comment by trollbridge
10 hours ago
GLM-5.2, performing like it would from a good provider: 8x B200s, so $450k. (No personal experience here.)
GLM-5.2, severely quantised: 512GB Mac Studio, somewhere between $10k and $35k for a used M3. Or run it on a CPU with 768GB of RAM by getting an old PowerEdge with DDR4 for around $5,000.
Qwen-3.6-35b-q6 runs well on an RTX 5090 ($4000 + cost of a PC) and is mediocre on an Intel Arc B70 ($1000 + cost of a PC, plus lots of fiddling to get the setup to work right).
Gemma is a good candidate for the cheaper stuff, but I lack personal experience with using it locally.
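For anyone wanting to sanity-check a pairing like the 35b-q6 on a 5090 before buying hardware, here is a rough back-of-envelope sketch in Python. It only counts the weights (parameters times bits per weight) plus a flat allowance for KV cache and runtime overhead. The function names and the 4 GB headroom figure are my own assumptions for illustration, and real quant formats usually carry a bit more than their nominal bits per weight, so treat the result as a lower bound rather than a guarantee.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    # billions of params * bits per weight / 8 bits per byte = billions of bytes
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, headroom_gb: float = 4.0) -> bool:
    """True if weights plus an assumed flat headroom fit in memory_gb.

    headroom_gb is a guessed allowance for KV cache and runtime overhead,
    not a measured number; long contexts can need far more.
    """
    return weight_footprint_gb(params_billions, bits_per_weight) + headroom_gb <= memory_gb


if __name__ == "__main__":
    # 35B parameters at a nominal 6 bits per weight
    print(weight_footprint_gb(35, 6))
    # RTX 5090 has 32 GB of VRAM
    print(fits(35, 6, memory_gb=32))
    # The same model against a 24 GB card
    print(fits(35, 6, memory_gb=24))
```

By this estimate the weights come to roughly 26 GB, which is why a 32 GB card is comfortable and a 24 GB card is not without a smaller quant or offloading some layers to the CPU.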