Comment by mchiang
7 days ago
Having tried many inference tools since the launch, I can say that many do not have the models implemented well, especially OpenAI’s harmony format.
Why does this matter? For this specific release, we benchmarked against OpenAI’s reference implementation to make sure Ollama is on par. We also spent a significant amount of time getting harmony implemented the way it was intended.
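For context, here is a minimal sketch of what a harmony-formatted prompt looks like. The special tokens follow OpenAI’s published harmony spec; the helper function itself is purely illustrative, not Ollama’s or OpenAI’s actual code:

```python
# Illustrative sketch: harmony wraps each chat message in
# <|start|>{role}<|message|>{content}<|end|> tokens, then primes the
# assistant turn at the end. This helper is hypothetical, for illustration only.
def render_harmony(messages):
    parts = []
    for role, content in messages:
        parts.append(f"<|start|>{role}<|message|>{content}<|end|>")
    parts.append("<|start|>assistant")  # leave the assistant turn open for the model
    return "".join(parts)

prompt = render_harmony([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
])
print(prompt)
```

Getting details like these special tokens and turn boundaries exactly right is what “implemented the way it was intended” means in practice; a prompt that deviates from the reference format can quietly degrade model quality.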
I know vLLM also worked hard to implement against the reference and has shared its benchmarks publicly.