Comment by Aurornis
10 hours ago
The benchmarks are from the unquantized model they release.
The unquantized model will only run on server hardware, some workstation GPUs, or some systems with 128GB of unified memory.
It’s a situation where if you have to ask, you can’t run the exact model they released. You have to wait for quantizations to smaller sizes, which come in a lot of varieties and have quality tradeoffs.
This would likely run fine in just 96 GB of VRAM, by my estimation. Well within the ability of an enthusiastic hobbyist with a few thousand dollars of disposable income.
Quantizations are already out: https://huggingface.co/unsloth/Qwen3.6-27B-GGUF
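For a rough sense of where memory figures like these come from, here is a back-of-envelope sketch in Python. It assumes the 27B parameter count implied by the model name in the link above, and it assumes the unquantized release stores 16 bits per weight, which is not confirmed in this thread. The helper name `weight_memory_gb` is made up for illustration. It counts weights only, so KV cache, activations, and runtime overhead come on top, and real GGUF quantizations mix bit widths, so actual file sizes will differ.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Hypothetical helper: ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


# Assumption: 27B parameters (from the model name), 16-bit unquantized weights.
PARAMS_BILLION = 27

for label, bits in [("16-bit (assumed unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_BILLION, bits):.1f} GB for weights")
```

Under those assumptions the 16-bit weights alone come to about 54 GB, which is why context length and the rest of the runtime footprint decide whether a given amount of VRAM is actually enough.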