Comment by rasz

9 months ago

Can it reproduce training data? Then it's not analysis but compression, lossy compression.

For most LLMs, with most works, no.

If you trained an LLM repeatedly on nothing but the text of LOTR until it could reproduce the books verbatim and then tried to sell copies of that LLM, then I agree that would be blatant copyright infringement, yes.