Comment by kouteiheika
3 days ago
> (If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)
Don't need patience for these, just money. A single RTX 6000 Pro runs those great and super fast.
> GPT OSS 120B
This one runs at a perfectly serviceable pace locally on a laptop 5090 with 64 GB of system RAM, with zero effort required. Just download Ollama and select this model from the drop-down.
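For anyone preferring the command line over the drop-down, the equivalent is roughly the following; a minimal sketch, assuming the model is published under the `gpt-oss:120b` tag in the Ollama library:

```shell
# Pull and run the model (the first run downloads ~60+ GB of weights,
# so expect the pull step to take a while on most connections)
ollama pull gpt-oss:120b
ollama run gpt-oss:120b "Explain what MoE models are in one paragraph."
```

Ollama handles the quantized weights and GPU/CPU offload split automatically, which is what makes the 64 GB RAM + laptop 5090 combination workable.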
Oh... 8 thousand eurobucks for the thing.
Or 4 thousand for the NVIDIA RTX A6000 which also runs the 120b just fine (quantized).
Or a single AMD Strix Halo with lots of RAM, which could be had before the RAM crisis for ~1.5k eur.
Or why not just buy a Blackwell rack?
Runs everything today with bleeding-edge performance.
Overall, what's the difference between 8k and 30k?
/s
You jest, but there are a ton of people on /r/localLLaMA who have an RTX 6000 Pro. No one has a Blackwell rack.
As long as you have the money, this hardware is easily accessible to normal people, unlike fancy server hardware.