Comment by unleaded

8 hours ago

Qwen3.6-35B-A3B-UD-Q4_K_M runs at about 11 tokens/second on my poor old 1060. Absolutely nuts how far we've come.

7 comments

unleaded

Running any model on my 1070 instantly crashes my old tower. Probably time to get off Windows and run Linux on it.

SV_BubbleTime 5 hours ago
It's understated how much of a boon AI development has been for Linux.
There isn’t any benefit to running a Windows machine.
- selectodude 5 hours ago
  
  Au contraire, I run models on WSL and my desktop reliably wakes up from sleep. Best of both worlds.
greenavocado 3 hours ago

Sounds like a hardware issue. NVIDIA driver issues can't be ruled out, though they're much rarer these days.

Mind sharing your llama.cpp settings for that?