
Comment by danielhanchen

17 hours ago

Oh, I didn't expect this to be on HN haha - but yes, for our new Qwen3.5 benchmarks we devised a slightly different approach to quantization, which we plan to roll out to all new models from now on!

Can you describe what this slightly different approach is and why it should work on all models?
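(The thread doesn't answer this, but Unsloth has elsewhere described its quants as "dynamic": sensitive tensors keep more bits rather than everything being quantized uniformly. Below is a minimal, hypothetical sketch of that general idea, not their actual method; the candidate bit widths, the MSE budget, and the tensor names are all made up for illustration.)

```python
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric round-to-nearest quantization to the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale).clip(-qmax, qmax) * scale

def pick_bits(w: np.ndarray, budget_mse: float = 1e-4) -> int:
    """Give a tensor more bits if low-bit quantization distorts it too much."""
    for bits in (2, 3, 4, 6, 8):
        err = np.mean((w - quantize(w, bits)) ** 2)
        if err <= budget_mse:
            return bits
    return 16  # fall back to keeping the tensor in higher precision

# Example: tensors with a wider value spread end up with more bits.
rng = np.random.default_rng(0)
for name, w in [("attn.q_proj", rng.normal(0, 0.02, (256, 256))),
                ("mlp.down_proj", rng.normal(0, 0.2, (256, 256)))]:
    print(name, "->", pick_bits(w), "bits")
```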

Nice! Your stuff already ran LLMs extremely well on <$500 boxes (24-32GB RAM) with iGPUs before this update.

I’m eager to try it out, especially if 16GB is viable now.

  • The 5080 has 16GB of VRAM, not system memory. I don't think you can get 24-32GB of VRAM in a $500 box
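(For rough intuition on whether 16GB is viable: weight memory scales as params × bits / 8, plus runtime overhead for the KV cache and buffers. This back-of-envelope sketch is my arithmetic, not from the thread; the model sizes and the flat 2GB overhead are assumptions.)

```python
def model_memory_gb(n_params_b: float, bits_per_weight: float,
                    overhead_gb: float = 2.0) -> float:
    """Rough estimate: weights at params * bits / 8 bytes, plus a flat
    allowance for KV cache, activations, and runtime buffers."""
    weights_gb = n_params_b * bits_per_weight / 8
    return weights_gb + overhead_gb

# Hypothetical numbers: does a 14B or 32B model fit in 16 GB?
for params_b in (14, 32):
    for bits in (4, 3, 2):
        need = model_memory_gb(params_b, bits)
        print(f"{params_b}B @ {bits}-bit: ~{need:.1f} GB "
              f"({'fits' if need <= 16 else 'too big'} for 16 GB)")
```

Under these assumptions a 14B model fits comfortably at 4-bit (~9GB), while a 32B model only squeezes into 16GB at around 3-bit or below.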