Comment by earino
5 days ago
Ok wonderful! Thanks.
I'm trying to set it up right now with LM Studio and qwen3-coder-30b. Hopefully it's going to work. Happy to take any pointers on anything y'all have tried that seemed particularly promising.
For sure! We also have a Discord server if you need any help: https://discord.gg/syntheticlab
Follow-up question: can the diff-apply and fix-json models also be run locally with octofriend, or do they have to hit your servers? Thanks!
They're just Llama 3.1 8b Instruct LoRAs, so yes, you can run them locally! The easiest route is probably to merge the LoRA weights into the base model, since AFAIK neither Ollama nor llama.cpp supports loading LoRAs directly, although llama.cpp ships utilities for doing the merge. In the settings menu or the config file you can set up an API base URL + env var credential for the autofix models, just like for any other model, which lets you point them at your local server :)
The weights are here:
https://huggingface.co/syntheticlab/diff-apply
https://huggingface.co/syntheticlab/fix-json
And if you're curious about how they're trained (or want to train your own), the entire training pipeline is in the Octofriend repo.
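In case it helps demystify the merge step: a LoRA adapter stores two low-rank matrices per adapted layer, and "merging" just folds `W' = W + (alpha / r) * B @ A` into the base weight so no adapter support is needed at inference time. Here's a toy pure-Python sketch of that arithmetic (the shapes and values are made up for illustration; real tools like llama.cpp's merge utilities do this across every adapted tensor):

```python
# Toy LoRA merge: W' = W + (alpha / r) * B @ A
# All shapes/values here are hypothetical, just to show the arithmetic.

def matmul(X, Y):
    """Naive matrix multiply for small nested-list matrices."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

d, r, alpha = 2, 1, 4          # hidden size, LoRA rank, LoRA alpha
W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight (d x d)
A = [[0.5, -0.5]]              # LoRA down-projection (r x d)
B = [[2.0], [0.0]]             # LoRA up-projection (d x r)

scale = alpha / r
BA = matmul(B, A)
W_merged = [
    [W[i][j] + scale * BA[i][j] for j in range(d)]
    for i in range(d)
]

# A plain forward pass with W_merged matches running base + adapter paths:
x = [1.0, 1.0]
merged_out = [sum(W_merged[i][j] * x[j] for j in range(d)) for i in range(d)]
adapter_out = [
    sum(W[i][j] * x[j] for j in range(d))
    + scale * sum(BA[i][j] * x[j] for j in range(d))
    for i in range(d)
]
assert merged_out == adapter_out
print(W_merged)
```

Once merged, the result is an ordinary checkpoint, which is why Ollama / llama.cpp can serve it without any LoRA-specific support.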
I think this might be your best bet right now. GLM-4.5-Air is probably next best. I'd run them at 8-bit using MLX.