Comment by CamperBob2
2 days ago
Out of curiosity, I just tried Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.70bpw.gguf (the version they recommend for the Raspberry Pi) on a Blackwell GPU. It cranked out 200+ tokens per second on some private benchmark queries, and it is surprisingly sharp.
It punches well above the weight class you'd expect from 3B active parameters. You could build the bear in Spielberg's "A.I." with this thing, if not the kid.
I’m bearish about that kind of future :)