Comment by kouteiheika
3 days ago
> (If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)
Don't need patience for these, just money. A single RTX 6000 Pro runs those great and super fast.
> GPT OSS 120B
This one runs at a perfectly serviceable pace locally on a laptop 5090 with 64 GB of system RAM, with zero effort required. Just download Ollama and select this model from the drop-down.
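For anyone preferring the command line over the drop-down, the equivalent is roughly the following; a minimal sketch, assuming the model is published under the `gpt-oss:120b` tag in the Ollama library:

```shell
# Pull and run the model (the first run downloads ~60+ GB of weights,
# so expect the pull step to take a while on most connections)
ollama pull gpt-oss:120b
ollama run gpt-oss:120b "Explain what MoE models are in one paragraph."
```

Ollama handles the quantized weights and GPU/CPU offload split automatically, which is what makes the 64 GB RAM + laptop 5090 combination workable.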
Oh... 8 thousand eurobucks for the thing.
Or 4 thousand for the NVIDIA RTX A6000 which also runs the 120b just fine (quantized).
Or a single AMD Strix Halo with lots of RAM, which could be had before the RAM crisis for ~1.5k eur.
Or why not just buy a Blackwell rack?
Runs everything today with bleeding-edge performance.
Overall, what's the difference between 8k and 30k?
/s
You jest, but there are a ton of people on /r/localLLaMA who have an RTX 6000 Pro. No one has a Blackwell rack.
As long as you have the money, this hardware is easily accessible to normal people, unlike fancy server hardware.