Comment by hacker_homie
4 hours ago
Maybe I missed the party, but it feels like it's just starting.
I have been running only local models, and we are finally at the point with Gemma 4 and Qwen 3.5 where they can start doing real coding work.
And with local models, the quota can't change out from under you.
I am surprisingly optimistic about local LLMs. Their progress over the last year, especially in distillation, has been remarkable. Qwen 3.5 is amazing for what it is; I think it's production-capable for many use cases, though not all. It does require more careful alignment of instructions, and it offers a smaller context window (even with very large unified memory). But with some care, one can code all day, every day, without limits. A Mac Mini with 64GB is probably sufficient for Qwen 3.5 35B; go larger for larger contexts.
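To make the "careful alignment of instructions" point concrete, here is a minimal sketch of preparing a coding request for a local model served behind an OpenAI-compatible API (as llama.cpp and Ollama both expose). The endpoint URL and model tag are illustrative assumptions, not details from the comment; it only builds the payload and does not send it.

```python
import json

# Assumed local setup: an OpenAI-compatible server (e.g. llama.cpp or
# Ollama) listening on localhost. URL and model tag are hypothetical;
# check your own server and model list.
BASE_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "qwen3.5:35b"  # illustrative tag, not a confirmed name

def build_request(instructions: str, code_context: str) -> dict:
    """Build a chat-completion payload with explicit, narrow instructions,
    which smaller local models tend to need more than frontier models."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "You are a coding assistant. Follow the instructions "
                        "exactly and return only a unified diff."},
            {"role": "user",
             "content": f"{instructions}\n\n```\n{code_context}\n```"},
        ],
        # Keep prompts lean and deterministic: local models have smaller
        # context windows, so every token of context should earn its place.
        "temperature": 0.2,
    }

payload = build_request("Rename function foo to bar.", "def foo():\n    pass")
print(json.dumps(payload, indent=2))
```

Sending `payload` as JSON to `BASE_URL` with any HTTP client completes the loop; the point is that the system prompt and trimmed context do more of the work than they would with a larger hosted model.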
Of course it's not as easy as pointing February Opus 4.6 at a folder and giving it one-sentence instructions.
The only viable future-proof solution to this hellscape is what you mention: local models and/or corporate models for work.