Comment by layer8
7 months ago
The premise was that the model is able to reproduce the memorized text, and that what saved Anthropic was their server-side filtering, which keeps that text from being reproduced. So the presumption is that without those filters, the model would be able to reproduce text substantial enough to constitute a copyright violation (otherwise they wouldn’t need the filter argument). Distributing a “machine” producing such output would constitute copyright infringement.
Maybe this misrepresents the actual Anthropic case (I have no idea), but it’s the scenario I was addressing.
> Distributing a “machine” producing such output would constitute copyright infringement.
This is the thing you haven't established.
Any ordinary general purpose computer is a "machine" that can produce copyrighted text, if you tell it to. But isn't it pretty important whether you actually do that with it or not, since it's a general purpose tool that can also do a large variety of other things?