Comment by fennecfoxy
8 days ago
Probably offload to regular ram I'd wager. Or really, really, reaaaaaaally quantized to absolute fuck. Qwen3:30B-A3B Q1 with a 1k Q4 context uses 5.84GB of vram.
8 days ago
Probably offload to regular ram I'd wager. Or really, really, reaaaaaaally quantized to absolute fuck. Qwen3:30B-A3B Q1 with a 1k Q4 context uses 5.84GB of vram.
No comments yet
Contribute on Hacker News ↗