Comment by danielhanchen
7 hours ago
We made Unsloth Studio which should help :)
1. Auto best official parameters set for all models
2. Auto determines the largest quant that can fit on your PC / Mac etc
3. Auto determines max context length
4. Auto heals tool calls, provides python & bash + web search :)
Yea, I actually tried it out last time we had one of these threads. It's undeniably easy to use, but it is also very opinionated about things like the directory locations/layouts for various assets. I don't think I managed to get it to work with a simple flat directory full of pre-downloaded models on an NFS mount to my NAS. It also insists on re-downloading a 3GB model every time it is launches, even after I delete the model file. I probably have to just sit down and do some Googleing/searching in order to rein the software in and get it to work the way I want it to on my system.
Sadly doesn't support fine tuning on AMD yet which gave me a sad since I wanted to cut one of these down to be specific domain experts. Also running the studio is a bit of a nightmare when it calls diskpart during its install (why?)
I applaud that you recently started providing the KL divergence plots that really help understand how different quantizations compare. But how well does this correlate with closed loop performance? How difficult/expensive would it be to run the quantizations on e.g. some agentic coding benchmarks?
Thanks for that. Did you notice that the unsloth/unsloth docker image is 12GB? Does it embed CUDA libraries or some default models that justifies the heavy footprint?
what are you using for web search?
Is unsloth working on managing remote servers, like how vscode integrates with a remote server via ssh?
Lmstudio Link is GREAT for that right now
Great project! Thank you for that!