Comment by NitpickLawyer

16 hours ago

I've been using local LLMs since before ChatGPT launched (GPT-J and GPT-NeoX, for those who remember), and I've tried all the promising models as they launch. While things are improving faster than I expected ~3 years ago, we're still not at 1-to-1 parity with the SotA models. For "consumer" local at least.

The best you can get today on consumer hardware is something like devstral2-small (24B), qwen-coder30b (underwhelming), or glm-4.7-flash (promising but buggy atm). And you'll still need a beefy workstation in the ~$5-10k range.
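
For a rough sense of why even the 24B class pushes you toward that kind of hardware, here's a back-of-the-envelope VRAM sketch. The 20% overhead for KV cache and activations is my own rough assumption, not a measured figure:

    # Sketch: estimate VRAM needed to serve a model (weights plus rough overhead).
    def vram_gb(params_b: float, bytes_per_param: float, overhead: float = 1.2) -> float:
        # overhead=1.2 assumes ~20% extra for KV cache and activations (rough guess)
        return params_b * bytes_per_param * overhead

    print(vram_gb(24, 0.5))  # 4-bit quant: ~14.4 GB, fits a single 24 GB card
    print(vram_gb(24, 2.0))  # FP16: ~57.6 GB, already multi-GPU territory

So a 24B model is workable on one high-end consumer card if you quantize, but anything much bigger escalates quickly.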

If you want open SotA, you need hardware worth $80-100k to run the big boys (dsv3.2, glm4.7, minimax2.1, devstral2-123b, etc.). That's fine for small office setups, but out of range for most local deployments, especially since these workstations need lots of power if you go with 8x GPUs, even with something like 8x 6000pro power-limited to 300 W each.
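
The power math alone is worth spelling out; a quick sketch, where the system-overhead figure is an assumption on my part:

    # Sketch: total wall draw for an 8-GPU box with power-limited cards.
    gpus = 8
    gpu_watts = 300        # per-card power limit, as in the setup above
    system_watts = 500     # assumed: CPU, RAM, fans, PSU losses
    total = gpus * gpu_watts + system_watts
    print(total)           # 2900 W, past a standard 15 A / 120 V circuit (~1800 W)

Even a power-capped build like that needs dedicated circuits before you've thought about cooling.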