Comment by echelon

3 days ago

Machine learning assets are not binary "open" or "closed". There is a continuum of openness.

To make a really poor analogy, this repo is like a version of Linux that you can't cross-compile or port.

To make another really poor (but fitting) analogy, this is like an "open core" SaaS platform where you know you'll never be able to run the features that matter on your own infrastructure.

This repo sits very low on that continuum. You're very limited in what you can do with Chatterbox TTS, and you certainly can't improve it or fit it to your data.

> You can fine-tune the weights yourself with your own training code.

This will never be built by anyone, and they know that. If it could be, they'd provide it themselves.

If you're considering Chatterbox TTS, just use MegaTTS3 [1] instead. It's better by all accounts.

[1] https://github.com/bytedance/MegaTTS3

> This will never be built by anyone, and they know that. If it could be, they'd provide it themselves.

Community fine-tuning code has been developed in the past for open-weights models without public first-party training code.

Why can't you improve it or fit it to your data?
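For what it's worth, the shape of that community code is not exotic. Below is a minimal PyTorch sketch, assuming the architecture can be reconstructed from the repo's inference code; every name, the checkpoint path, and the toy dataset are hypothetical stand-ins, not Chatterbox's actual API:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class TinyTTS(nn.Module):
    """Stand-in architecture. In practice you would mirror the layers
    defined in the repo's inference code so the released state dict
    loads cleanly."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TinyTTS()

# Open weights are just a tensor dict; loading them needs no first-party
# trainer. The path is a hypothetical placeholder, so it stays commented out.
# state = torch.load("chatterbox.pt", map_location="cpu")
# model.load_state_dict(state, strict=False)

# Random tensors stand in for your own (input, target) fine-tuning corpus,
# purely so the sketch runs end to end.
data = TensorDataset(torch.randn(128, 64), torch.randn(128, 64))
loader = DataLoader(data, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_fn = nn.MSELoss()

model.train()
for inputs, targets in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

torch.save(model.state_dict(), "finetuned.pt")  # weights fitted to your data
```

The genuinely hard part is reconstructing the architecture and loss, not the training loop, and communities have done exactly that before: the fine-tuning ecosystem that grew around LLaMA's bare weight release is the obvious example.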

This can, in fact, be cross-compiled and ported in the Linux analogy. A closer version of that analogy would be: a kernel dev writes code for some part of the Linux kernel using JetBrains' CLion, relying on CLion features that made the process much easier than writing it in `nano`. By your logic, the resulting kernel code is not "open" because the tooling used to create it is not open. That is, of course, nonsense.

I agree that the project as a whole is less open than it could be, but the weights are indeed as open as they can be, no scare quotes required.