Comment by manmal
1 year ago
The top comment in this thread mentions a 6k setup, which could likely be used with vLLM with a bit more tinkering. AFAIK vLLM's batched inference is great.
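For reference, a minimal sketch of what batched inference with vLLM's offline API looks like; the model name, prompts, and sampling settings here are assumptions for illustration, not from the comment or the linked thread:

```python
# Minimal sketch of vLLM offline batched inference.
# Model name and sampling settings are illustrative assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Summarize the benefits of batched inference.",
    "Explain what a KV cache is in one sentence.",
    "List three open-weight LLMs.",
]

sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

# vLLM schedules all prompts together (continuous batching),
# so throughput is much better than looping over single requests.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```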