Comment by sourcecodeplz

1 month ago

Running LLMs locally is fun and powerful, but if you want to get work done... it is a big headache. You have to plan everything up front, write specs, etc. The big OpenAI and Claude models just get you from a few sentences.

Yup, especially when for a lot of us, the price of the frontier subscription has become a cost of doing business over the last 6 months.

If you're already doing big boy stuff with big boy models, then... just carry on trucking!

Only place I'd differ is for vision/OCR tasks. Small/medium open-weights models are as good as SOTA there, and frontier prefill token prices are really not worth it for larger batch tasks.

The other thing people forget is that if you want even a smallish LLM as a reliable personal service, you've got to carve out 16-24 GB of (V)RAM and leave it permanently running.
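
For a rough feel of where a figure like that comes from, here's a back-of-envelope sketch. It only counts the weights (parameters times bits per weight) plus a flat overhead for KV cache and runtime buffers. The function names and the 2 GiB overhead default are my own placeholders, not measured numbers, and real usage depends on context length and the runtime you use.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory taken by the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def resident_gib(params_billion: float, bits_per_weight: float,
                 overhead_gib: float = 2.0) -> float:
    """Weights plus a flat overhead allowance.

    overhead_gib is an assumed placeholder for KV cache and runtime
    buffers; it grows with context length in practice.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib


if __name__ == "__main__":
    # Hypothetical model sizes and quantization levels, for illustration only.
    for params, bits in [(8, 16), (14, 8), (32, 4)]:
        print(f"{params}B @ {bits}-bit: ~{resident_gib(params, bits):.1f} GiB")
```

The point is just that the memory stays reserved whether or not you're sending requests, so it's effectively gone from the machine for anything else.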

It's actually technically easy now to run a large model at home for offline use (thanks to the Chinese labs that release their top-notch models).

The main problem is finding the money :/