Comment by pennomi
3 days ago
If your moat is “please don’t copy my outputs”, you don’t have a moat. There is no such thing as a distillation “attack”.
How does it differ from pirating music or movies?
According to US AI labs, training on other people's output is fair use. So that's how.
AI training is considered transformative. That's how AI training gets around copyright, and it's probably consistent with copyright precedent. For example, indexing the web is considered transformative, even though you can recover the full text of everything from a positional inverted index.
There is no intellectual property to “pirate”.
Model outputs don't qualify for copyright. They aren’t patented. They aren’t trade secrets - the companies sell them. They aren’t trademarks, obviously. They are nothing, actually.
Machine-extruded text is not copyrightable, since there was no human creativity involved in producing it.
(and if you argue the US models do produce copyrighted works, then oooops - whose copyright is it huh?)
I see. Fair point
I'd say that when I pay for a model, the copyright in the output belongs to me. This is as work-for-hire as it gets.
Ow my head.
How very Machiavellian-libertarian of you.
But then don't even try to combine it with any notion of "leadership", since distillation is literally "copying the actual leader".
It really isn't. You can improve a model by distilling from a weaker one.
Self-distillation is also a technique.