Comment by Aurornis

10 hours ago

The benchmarks are from the unquantized model they release.

This will only run on server hardware, some workstation GPUs, or some 128 GB unified-memory systems.

It’s a case of “if you have to ask, you can’t run the exact model they released.” You’ll have to wait for quantizations to smaller sizes, which come in many varieties, each with its own quality tradeoffs.

bityard 8 hours ago

This would likely run fine in just 96 GB of VRAM, by my estimation. Well within the ability of an enthusiastic hobbyist with a few thousand dollars of disposable income.

Quantizations are already out: https://huggingface.co/unsloth/Qwen3.6-27B-GGUF
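
For anyone who wants to check the memory estimates in this thread, here is a rough back-of-envelope sketch. It assumes the parameter count is the 27B in the model name and that the unquantized release stores 16 bits per weight. Neither is confirmed here. It counts weights only, so KV cache, activations, and runtime overhead all come on top. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9,
    simplifies to params_billion * bits / 8.
    """
    return params_billion * bits_per_weight / 8


# Assumption: 27 billion parameters, read off the "27B" in the model name.
PARAMS_B = 27

# Assumption: nominal bit widths. Real quantization formats usually store
# somewhat more than their nominal bits per weight, so treat these as floors.
for label, bits in [("16-bit (assumed unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.1f} GB")
```

Under those assumptions the 16-bit weights come to about 54 GB, which would leave roughly 42 GB of a 96 GB budget for context and overhead, while a nominal 4-bit quantization is around 13.5 GB.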