Comment by kouteiheika

3 days ago

> (If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)

You don't need patience for these, just money. A single RTX 6000 Pro runs them great and super fast.

> GPT OSS 120B

This one runs at a perfectly serviceable pace locally on a laptop 5090 with 64 GB of system RAM, with zero effort required. Just download ollama and select this model from the drop-down.
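If you'd rather script against the local install than use the drop-down, the same ollama daemon is reachable from the official Python client. A minimal sketch (the `gpt-oss:120b` tag is an assumption; check `ollama list` for whatever your install calls the model):

```python
# Minimal sketch: chat with a locally running GPT OSS 120B through ollama,
# using the official `ollama` Python client (pip install ollama).
# The model tag "gpt-oss:120b" is an assumption -- substitute the tag
# that `ollama list` reports on your machine.
import ollama

response = ollama.chat(
    model="gpt-oss:120b",
    messages=[{"role": "user", "content": "Summarize what mixture-of-experts means."}],
)
print(response["message"]["content"])
```

The daemon also exposes an OpenAI-compatible HTTP endpoint on localhost:11434, so most existing tooling can be pointed at it without changes.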

Or a single AMD Strix Halo with lots of RAM, which could be had before the RAM crisis for ~1.5k eur.

Or why not just buy a Blackwell rack?

Runs everything today with bleeding edge performance.

Overall, what's the difference between 8k and 30k?

/s

  • You jest, but there are a ton of people on /r/localLLaMA who have an RTX 6000 Pro. No one has a Blackwell rack.

    As long as you have the money, this hardware is easily accessible to normal people, unlike fancy server hardware.