Comment by andsoitis

14 hours ago

Is it any good?

I haven't tried it for anything myself yet. The paper provides several benchmarks. The emphasis during training was on multi-language support (over 1800 languages are represented in its pre-training data, which is 40% non-English) and non-copyrighted training data... and the benchmarks seem to suffer for it.

https://arxiv.org/abs/2509.14233

Yes, it's not bad, although it's not meant to be a chatbot. Post-training is limited, so it won't feel as smooth as top-of-the-line models, of course. The number of supported languages is mind-boggling.

The focus was on open data, language coverage, and auditability.

Their loss function is fancy, but I'm not sure what effect it has.