Comment by IncreasePosts
1 day ago
Doesn't that imply that only the training process isn't copyrightable? But weights aren't just training, they're also your source data. And if the training set shows originality in selection, coordination, or arrangement, isn't that copyrightable? So why wouldn't the weights also be copyrightable?
The problem is, can you demonstrate that originality of selection and arrangement actually survives in the trained model? It is legally doubtful.
Nobody knows for sure what the legal answer is, because the question hasn’t been considered by a court, but the consensus of expert legal opinion is that the copyrightability of models is doubtful under US law, and the kind of argument you make isn’t strong enough to change that. As I said, UK law is a different case: nobody really needs your argument there, because model weights are likely copyrightable in the UK already.
> The problem is, can you demonstrate that originality of selection and arrangement actually survives in the trained model? It is legally doubtful.
It's particularly perilous since the AI trainers are at the same time in a position where they want to argue that the copyrighted works they included in the training data don't actually survive in the trained model.
For the same reason GenAI output isn't copyrightable regardless of how much time you spend tweaking your prompts.
Also, I'm pretty sure none of the AI companies would really want to touch the idea of the source data's copyright affecting the weights' own copyright, considering that all of them pretty much hoover up the entire Internet without caring about those copyrights. (And IMO, claiming that they should be able to ignore the copyrights of training data, and that GenAI output is not under copyright, while at the same time trying to claim copyright for the weights is dishonest, if not outright leechy.)
The weights are mathematical facts. As raw numbers, they are not copyrightable.
A computer program is just 0s and 1s. Harry Potter books are just raw letters, or raw numbers if it's an ebook.
(The combination is what makes it copyrightable).
In practice it's not the combination that is copyrighted (you cannot claim copyright over a binary just because you zipped it, or over a movie because you re-encoded it, for instance).
It's the “actual creativity” inside. And it is a fuzzy concept.
`en_windows_xp_professional_with_service_pack_3_x86_cd_vl_x14-73974.iso` is also just raw numbers, but I believe Windows XP was copyrightable
Interesting.
From what I understand, copyright only applies to the original source code, GUI and bundled icon/sound/image files. Functionality etc. would fall under patent law. So the compiled code on your .ISO for example would not only be "just raw numbers" but uncopyrightable raw numbers.