Comment by StrLght
4 hours ago
I was also surprised by this sentence. It sounds like this is the author's first attempt at running models locally.
Or maybe the author has been running heavily quantized small models all this time: the Gemma 4 GGUF he's using is Q4 and only 16 GB. In my experience, quants like this tend to perform much worse.