Comment by reissbaker

6 months ago

I'll add docs! Tl;DR: in the onboarding (or in the Add Model menu section), you can select adding a custom LLM. It'll ask you for your API base URL, which is whatever localhost+port setup you're using, and then an env var to use as an API credential. Just put in any non-empty credential, since local models typically don't actually use authentication. Then you're good to go.

IMO gpt-oss-120b is actually a very competent local coding agent — and it should fit on your 128GB Macbook Pro. I've used it while testing Octo actually, it's quite good for a local model. The best open model in my opinion is zai-org/GLM-4.5, but it probably won't fit on your machine (although it works well with APIs — my tip is to avoid OpenRouter though since quite a few of the round-robin hosts have broken implementations.)

5 comments

reissbaker

earino 6 months ago

Ok wonderful! Thanks.

I'm trying to set it up right now with lmstudio with qwen3-coder-30b. Hopefully it's going to work. Happy to take any pointers on anything y'all have tried that seemed particularly promising.

reissbaker 6 months ago
For sure! We also have a Discord server if you need any help: https://discord.gg/syntheticlab
- earino 6 months ago
  
  Follow up question, can the diff apply and fix json models be run locally as well with octofriend, or do they have to hit your servers? Thanks!
  
  1 reply →
jasonjmcghee 6 months ago

I think this might be your best bet right now. GLM-4.5-Air is probably next best. I'd run them at 8-bit using MLX.