Comment by Palmik
20 hours ago
I wish there were more open benchmarks comparing different setups and different engines. There are so many knobs to tune (TP / DP / PP / PD / spec. decoding / etc.), and while the optimal setup depends heavily on the model, the environment and the traffic, some useful general conclusions could likely still be drawn.
It almost feels like over the past year there has been some unwritten agreement between the three main open-source engines (vLLM, SGLang, TRT-LLM) not to compare against each other directly :) They used to publish benchmarks comparing against each other quite regularly.