Comment by knollimar

3 months ago

In regard to "Because the RNG wasn't written "based on" the copyrighted content."

Does that mean I can distribute the seed if I find one and this RNG wasn't trained on that content?

Does it prevent me from sharing that number on the internet?

It seems like there's a lot of subjective intent here, which I'm extremely skeptical of

For an LLM also:

If it's lossy enough that it needs RAG to fix the results is that okay?

-------------------

In my opinion, actually getting the output is where the infringement happens. Having and distributing the LLM weights shouldn't be infringement (in my head) because of the enforceability of results. Otherwise you risk banning RNGs, or forcing them all to prove they weren't trained on copyrighted content

> If it's lossy enough that it needs RAG to fix the results is that okay?

but then the only way RAG can "fix" the result is if the RAG system stored the song text in its vector database

in which case the legal situation and the ways to fix the issue are much clearer

in a certain way an LLM which only encodes language but not knowledge, and then uses RAG and similar, is the most desirable (not just for copyright reasons but also e.g. updatability, traceability, removability of misinformation etc.)

sadly AFAIK it doesn't work, as language and knowledge details are too interleaved

> Does that mean I can distribute the seed if I find one and this RNG wasn't trained on that content?

honestly I think this falls outside of the situations copyright law considers. But if you consider that copyright law mostly doesn't care about technical implementation details, and that the "spirit of the law" (the intent of the lawmaker) matters in unclear cases, I think I also have a best-guess answer:

Neither the RNG nor the seed is a copyright violation by itself, but if you spread them with the intent to spread an unlicensed copy, you are still committing a copyright violation. In that context the seed might, for example, be taken down from sharing sites even if by itself it isn't a copyright violation.
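To make the seed thought experiment concrete, here is a rough sketch (my own illustration, not anything from the case being discussed). It brute-forces a seed for Python's standard `random` module so that the generator happens to emit a tiny target string. The helper names `generate` and `find_seed` and the `max_tries` limit are made up for this example.

```python
import random
from typing import Optional


def generate(seed: int, length: int) -> bytes:
    """Deterministically produce `length` bytes from a seeded generator."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


def find_seed(target: bytes, max_tries: int = 5_000_000) -> Optional[int]:
    """Try seeds 0, 1, 2, ... until the generator output equals `target`."""
    for seed in range(max_tries):
        if generate(seed, len(target)) == target:
            return seed
    return None  # no matching seed within the search limit


target = b"hi"  # two bytes, so roughly 1 in 65,536 seeds should match
seed = find_seed(target)
print(seed, generate(seed, len(target)) if seed is not None else None)
```

The search cost grows by a factor of about 256 for every extra byte, so for anything the size of a song text a matching seed would have to carry roughly as much information as the text itself. In that sense the seed plus a generic RNG is just another encoding of the content.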

The thing is in the end you can transform _any_ digital content into

- "just a number"

- or "just an equation", "equation system" etc.

- or an image, matrix, graph, human-readable text, or pretty much anything

so fundamentally you can't have a clean cut between what can and can't be a copyright violation
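The "just a number" case is easy to show. A minimal sketch, assuming the content is available as bytes (the function names and the sample string are placeholders I picked for illustration):

```python
from typing import Tuple


def content_to_number(data: bytes) -> Tuple[int, int]:
    """Read the whole byte string as one big-endian integer.

    The length is returned as well, because leading zero bytes
    would otherwise be lost when converting back.
    """
    return int.from_bytes(data, byteorder="big"), len(data)


def number_to_content(number: int, length: int) -> bytes:
    """Invert content_to_number and recover the original bytes exactly."""
    return number.to_bytes(length, byteorder="big")


text = "some arbitrary digital content"  # placeholder content
number, length = content_to_number(text.encode("utf-8"))
restored = number_to_content(number, length).decode("utf-8")

print(number)            # a single (very large) integer
print(restored == text)  # the round trip is lossless
```

The same round trip works for any file, which is why "it's only a number" can't by itself decide whether something is a copy.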

which is why it matters so much that the law acts on a higher abstraction level than what exactly happens technically.

And why the intent of the law (in gray-area cases) matters so much.

And why law really shouldn't be a declarative definition of strict mathematical rules.