Comment by somethingsome

20 days ago

I was hoping it would work with vLLM (OpenAI-compatible) so I could test it. Does anyone know of a similar proxy for local coding models?

4 comments

DeathArrow 20 days ago

Check this: https://github.com/antoinezambelli/forge/tree/az/vllm

zambelli 20 days ago
Yeah, I got it working as a quick test run to tell a model issue from a backend issue in a consumer app. It worked on my dual 5070 Ti rig, but I didn't have time to formalize it all the way and merge it in. Thanks for linking it!
- somethingsome 20 days ago
  
  Thanks, I just tried it, and it worked for me on 2x L40S with vLLM. I had some issues with the model name: forge was forwarding 'default' instead of the real model name, 'Qwen2.5-Coder-14B-Instruct'.
  If anyone else struggles with this step, I added this to the vLLM args: --served-model-name "Qwen2.5-Coder-14B-Instruct" --served-model-name "default"
  So 'default' becomes an alias.
  I haven't tested Forge itself yet; I was just happy that it worked for now ;)
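
For anyone who wants to confirm that the alias actually took effect, here is a minimal sketch that asks an OpenAI-compatible server which model names it advertises. It assumes the server exposes the standard `/v1/models` endpoint and listens on `http://localhost:8000` (an assumed address, adjust as needed). The helper names `served_names`, `missing_names`, `fetch_models`, and `report` are made up for illustration and are not part of vLLM or forge. It may also be worth checking the vLLM docs on whether `--served-model-name` expects several names after a single flag rather than the flag repeated; this check shows which names ended up registered either way.

```python
import json
import urllib.request


def served_names(models_payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_payload.get("data", [])]


def missing_names(models_payload: dict, expected: list[str]) -> list[str]:
    """Return the expected names that the server does not advertise."""
    available = set(served_names(models_payload))
    return [name for name in expected if name not in available]


def fetch_models(base_url: str) -> dict:
    """GET {base_url}/v1/models and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return json.load(resp)


def report(base_url: str, expected: list[str]) -> None:
    """Print the advertised names and any expected names that are absent."""
    payload = fetch_models(base_url)
    print("served:", served_names(payload))
    print("missing:", missing_names(payload, expected))
```

Calling `report("http://localhost:8000", ["Qwen2.5-Coder-14B-Instruct", "default"])` against a running server should print an empty `missing` list if both the real name and the alias are registered.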
  