Comment by simonw

8 hours ago

The thing I'm most excited about is the moment that I run a model on my 64GB M2 that can usefully drive a coding agent harness.

Maybe Qwen3.5-35B-A3B is that model? This comment reports good results: https://news.ycombinator.com/item?id=47249343#47249782

I need to put that through its paces.

Yesterday I test-ran Qwen3.5-35B-A3B on my MBP M3 Pro with 36GB via LM Studio and OpenCode. I didn't have it write code; instead I had it use Rodney (thanks for making it btw!) to take screenshots and write documentation from them. Overall I was pretty impressed with how well it handled the harness and completed the task locally. In the past I would've had Haiku do this, but I might switch to doing it locally from now on.

I suppose this shows my laziness because I'm sure you have written extensively about it, but what orchestrator (like opencode) do you use with local models?

  • I've not really settled on one yet. I've tried OpenCode and Codex CLI, but I know I should give Pi a proper go.

    So far none of them have been useful enough at first glance with a local model for me to stick with them and dig in further.

    • I've used OpenCode, and the remote free models it defaults to aren't awful, but they're definitely not on par with Gemini CLI or Claude. I'm really interested in finding a way to chain multiple high-end consumer Nvidia cards into a local alternative to the big labs' offerings.


    • When you say you use a local model in OpenCode, do you mean through the Ollama backend? Last time I tried it with various models, I hit issues where the model was calling tools in the wrong format.

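For what it's worth, the "wrong format" failure with local models usually means the model emitted its tool call as plain text in `content` rather than in the structured `tool_calls` field that OpenAI-compatible endpoints (such as Ollama's `/v1` API) are expected to return and that harnesses like OpenCode parse. A minimal sketch of a format check, with a hypothetical helper and example payloads (not from the thread):

```python
import json

def has_wellformed_tool_calls(message: dict) -> bool:
    """Hypothetical helper: does an assistant message carry tool calls
    in the OpenAI-compatible shape a coding-agent harness expects?"""
    calls = message.get("tool_calls")
    if not isinstance(calls, list) or not calls:
        return False
    for call in calls:
        fn = call.get("function", {})
        if "name" not in fn:
            return False
        try:
            # Arguments must be a JSON-encoded object, not free text.
            args = json.loads(fn.get("arguments", ""))
        except json.JSONDecodeError:
            return False
        if not isinstance(args, dict):
            return False
    return True

# A structured call, as a harness expects:
good = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "read_file",
                     "arguments": "{\"path\": \"README.md\"}"},
    }],
}

# The failure mode described above: the call leaks into plain content,
# so the harness sees prose instead of a tool invocation.
bad = {
    "role": "assistant",
    "content": "<tool_call>{\"name\": \"read_file\"}</tool_call>",
}
```

A check like this makes it easy to tell whether the problem is the model's output or the harness's parsing.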