Comment by embedding-shape
2 days ago
Man, really? Why, just why? If it's similar, why not just the same? It's like they're purposefully adding more work for the ecosystem to support their special model instead of just trying to add more value to the ecosystem.
The parser is a small part of running an LLM, and Zai's format is superior to Harmony: it avoids having the model escape JSON in most cases by using XML, so e.g. long code edits are more in-domain compared to pretraining data (where code is typically not nested in JSON and isn't JSON-escaped). FWIW almost everyone has their own format.
Also, Harmony is a mess. The common API specs adopted by the open-source community don't have developer roles, so including one is just bloat for the Responses API no one outside of OpenAI adopted. And why are there two types of hidden CoT reasoning? Harmony tool definition syntax invents a novel programming language that the model has never seen in training, so you need even more post-training to get it to work (Zai just uses JSON Schema). Etc etc. It's just bad.
Re: removing newlines from their old format, it's slightly annoying, but it does give a slight speed boost, since it removes one token per call and one token per argument. Not a huge difference, but not nothing, especially with parallel tool calls.
Sometimes worse is better, I don't really care what the specific format is, just that providers/model releasers would use more of the same, because compatibility sucks when everyone has their very own format. Conveniently for them, it gets harder to compare models when everyone has different formats too.