Comment by hamburglar
1 day ago
Not having a lot of experience with this, I'll ask a naive question: is there a world where you can take your local LLM, hook it up to Claude, and get more Claude-like results from your local model? Obviously, there are going to be material differences in how these perform, but are we getting close to a place where this is viable? I imagine the answer is some combination of “not yet,” “yes, but it’s a lot slower,” and “yes, but there is little point in doing this because ‘what Claude gets you’ is highly baked into Anthropic’s models, and that’s part of what you’re paying for.”
It's already been done. Look at the Forge project for local LLMs. It can bring 8B models up to Opus-like performance at tool calling.
You can use Ollama as the backend for Claude Code!
I would characterize it as doable, but not really viable. It's "yes, you can do it, but it's a lot slower", with a hint of "and the best local LLMs are on par with Haiku or maybe Sonnet, so larger and longer tasks get notably worse".
I have a "task router", if that counts? It's a small local LLM on my Mac mini (Qwen 3.5 0.8B) that, when activated, decides with Pi whether to route a given task to my local LLM (Step 3.7 Flash) or to <given cloud provider>. It works surprisingly well. Though some of the cloud providers are getting so good and so cheap (GLM 5.1/5.2, MiniMax M3, among others) that my local one is becoming less and less relevant, depressingly!
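For anyone curious what a router like that boils down to, here is a minimal sketch, not the setup described above. It assumes the small model is reachable through a plain function; `classify`, `route_task`, and the prompt wording are all hypothetical names made up for illustration, and the "fall back to cloud" rule is my assumption rather than anything the commenter stated.

```python
from typing import Callable

LOCAL = "local"
CLOUD = "cloud"


def route_task(task: str, classify: Callable[[str], str]) -> str:
    """Ask a small classifier model where a task should run.

    `classify` is a hypothetical stand-in for a call to the small local
    model: it receives a prompt and returns the model's raw text reply.
    """
    prompt = (
        "Answer with one word, LOCAL or CLOUD. "
        "Say LOCAL only if the task is short and simple.\n"
        f"Task: {task}"
    )
    reply = classify(prompt).strip().lower()
    # Assumption: anything that is not a clear "local" falls back to the
    # cloud, so a garbled reply never sends a hard task to the weaker model.
    return LOCAL if reply.startswith("local") else CLOUD


if __name__ == "__main__":
    # Stub classifier so the sketch runs without any model installed.
    def fake_classifier(prompt: str) -> str:
        return "LOCAL" if "rename" in prompt else "CLOUD"

    print(route_task("rename a variable in utils.py", fake_classifier))
    print(route_task("redesign the billing service", fake_classifier))
```

Passing the classifier in as a function keeps the routing rule testable without a model, and swapping in a real call to whatever serves the small model should only touch that one argument.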
You're kinda talking about Claude being used in a planning/architect role while the local LLM just executes the plan (performing the edits). In that form, at least, it's a thing, yes.
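The planner/executor split can be sketched roughly like this. It is only an illustration under stated assumptions: `planner` stands in for a call to the stronger cloud model, `executor` for a call to the local model, and the "one step per line" plan format is something I made up, not how any particular tool does it.

```python
from typing import Callable, List


def plan_then_execute(
    task: str,
    planner: Callable[[str], str],
    executor: Callable[[str], str],
) -> List[str]:
    """Have one model write a plan and another carry out each step.

    `planner` and `executor` are hypothetical stand-ins for model calls;
    each takes a prompt string and returns the model's text reply.
    """
    plan = planner(f"Break this task into one small edit per line:\n{task}")
    # Assumption: the planner replies with one step per line; blank lines
    # are ignored.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    # The executor only ever sees a single small step at a time.
    return [executor(step) for step in steps]
```

The point of the split is that the expensive model is called once per task, while the cheap local model handles the many small, mechanical calls.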
opencode is like Claude Code, but you can use any model.