Comment by fooblaster
11 hours ago
What inference runtime are you using? You mentioned MLX, but I didn't think anyone was using that for local LLMs.
LM Studio (which prioritizes MLX models if you're on a Mac and they're available). I have it set up as a server on my personal laptop with Tailscale, so while I'm working I can connect to it from my work laptop from wherever I might be, and it's integrated into the Zed editor through its built-in agent - it's pretty seamless. Whenever I want to use my personal laptop I just unload the model and do other things.

It's a really nice setup. I'm definitely happy I got the 128GB MBP - I do a lot of video editing and 3D rendering work as a hobby, so it's sort of dual-purpose in that way: I can take advantage of the compute power when I'm not actually on the machine by running it as an LLM server.
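For anyone curious what the client side of a setup like this looks like: LM Studio serves an OpenAI-compatible API (port 1234 by default), so from another machine on the tailnet it's just a standard OpenAI client pointed at the server's Tailscale hostname. A minimal sketch in Python - the hostname and model name here are hypothetical, substitute whatever your tailnet and loaded model actually use:

    # Minimal sketch: query a remote LM Studio server over Tailscale.
    # Assumes LM Studio's default port (1234); hostname and model are made up.
    from openai import OpenAI

    # LM Studio exposes an OpenAI-compatible API; the api_key value is
    # ignored by the server but required by the client library.
    client = OpenAI(
        base_url="http://personal-mbp.tailnet.ts.net:1234/v1",  # hypothetical Tailscale hostname
        api_key="lm-studio",
    )

    response = client.chat.completions.create(
        model="mlx-community/Qwen2.5-Coder-32B-Instruct-4bit",  # whichever model is loaded
        messages=[{"role": "user", "content": "Say hello"}],
    )
    print(response.choices[0].message.content)

Zed's agent talks to the same endpoint; you just point it at the server URL instead of a local one.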
LM Studio has had an MLX engine and models since 2024.