Comment by FrasiertheLion
2 months ago
We’re already using vLLM as our inference server for our standard models. We can run whatever inference server is needed for custom deployments.
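(A minimal sketch of what this setup looks like from the client side, assuming the vLLM deployment exposes its OpenAI-compatible endpoint; the host, port, and model name below are illustrative, not from the comment.)

```python
# Sketch: querying a vLLM server through its OpenAI-compatible API.
# Assumes a vLLM instance is running locally on port 8000; the model
# name is a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",                      # vLLM does not require a real key by default
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```

Because the client only depends on the OpenAI-compatible interface, the same code works if a different inference server is swapped in for a custom deployment.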