Comment by renjimen

6 days ago

You can self-host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.
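
For concreteness, here's a minimal sketch of talking to a self-hosted model through an OpenAI-compatible endpoint. I'm assuming an Ollama server on its default port; llama.cpp's server and vLLM expose the same interface, and the model tag here is just illustrative:

```python
# Minimal sketch: query a self-hosted model via an OpenAI-compatible API.
# Assumes an Ollama server running locally on its default port (11434);
# llama.cpp's server and vLLM expose the same interface on their own ports.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # local server, not a cloud API
    api_key="unused",  # the local server ignores the key; the client just requires one
)

response = client.chat.completions.create(
    model="qwen2.5-coder:32b",  # illustrative: any locally pulled model tag works
    messages=[{"role": "user", "content": "Write a Python function to deduplicate a list."}],
)
print(response.choices[0].message.content)
```

Because the API surface matches the hosted services, most open-source IDE integrations can be pointed at a local base URL with no other changes.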

An important caveat: the models you can host at home (i.e. without buying a rig costing tens of thousands of dollars) won't be as effective as the proprietary ones. A realistic size limit is around 32 billion parameters with quantisation, which fits on a 24GB GPU or a sufficiently large MacBook Pro. These models are roughly on par with the original GPT-4: they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can. (There's the recent Devstral model, but it requires a specific harness, so I haven't tested it.)
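
The 32B figure follows from simple arithmetic on weight storage. A back-of-envelope sketch (the overhead number is a rough assumption, not a measured figure):

```python
# Back-of-envelope VRAM estimate for a quantised model.
# Rule of thumb: bytes ≈ parameters × bits_per_weight / 8, plus overhead
# for the KV cache and activations (the overhead here is a rough guess).

def vram_gb(params_billions: float, bits_per_weight: float, overhead_gb: float = 4.0) -> float:
    weights_gb = params_billions * bits_per_weight / 8  # 1B params ≈ 1 GB at 8-bit
    return weights_gb + overhead_gb

# 32B parameters at 4-bit quantisation: ~16 GB of weights + ~4 GB overhead ≈ 20 GB,
# which squeezes into a 24 GB GPU. The same model at 16-bit needs ~64 GB for weights alone.
print(vram_gb(32, 4))   # ~20.0
print(vram_gb(32, 16))  # ~68.0
```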

DeepSeek-R1 is on par with frontier proprietary models, but running it efficiently requires an 8×H100 node. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but you'll be closer to seconds-per-token territory.
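
If you do want to try the offloading route, llama.cpp-style runners let you split layers between GPU and CPU. A rough sketch with the llama-cpp-python bindings; the model filename and layer count are hypothetical, and a model this size will still be very slow:

```python
# Sketch: partial GPU offload with llama-cpp-python (pip install llama-cpp-python).
# The model path and layer count are hypothetical; with an extreme quant of a
# model this large, most layers still live in system RAM and throughput drops
# to seconds per token, as noted above.
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-r1-extreme-quant.gguf",  # hypothetical extreme-quant GGUF file
    n_gpu_layers=20,   # offload as many layers as fit in VRAM; the rest run on CPU
    n_ctx=4096,        # modest context to keep the KV cache small
)

out = llm("Explain tail-call optimisation in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])
```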