Comment by plufz

1 day ago

I need this too, and looked into it quite a bit about a year ago. I haven't had time to check out the recent developments with Docker Model Runner (vllm-metal) or podman libkrun. Did neither of those work for you?

vllm-metal doesn't give the container GPU access; it's just an OpenAI-compatible endpoint, which I can already get via an LM Studio endpoint over the network.

>podman libkrun

Haven't tried it, but my research suggests it's still really shaky. podman libkrun exposes Vulkan inside the VM, while PyTorch expects MPS on Macs. Sounds like you can force Vulkan, but that's apparently slow and still beta-ish?
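To make the mismatch concrete, here's a minimal sketch (not from the thread, just an illustration): PyTorch's Mac GPU path is the MPS backend, so code running inside a libkrun VM that only exposes Vulkan will typically see `mps` as unavailable and silently fall back to CPU.

```python
# Hedged sketch: detect which device PyTorch would actually use.
# In a libkrun VM exposing only Vulkan, torch.backends.mps.is_available()
# is expected to return False, so you end up on CPU.
try:
    import torch
    has_mps = torch.backends.mps.is_available()
except ImportError:
    # torch not installed at all
    has_mps = False

device = "mps" if has_mps else "cpu"
print(f"PyTorch would run on: {device}")
```

On bare-metal macOS with a recent PyTorch this prints `mps`; inside the VM you'd expect `cpu`, which is why the Vulkan path matters at all.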