
Comment by panzi

10 hours ago

If the output is public domain it's fine as I understand it.

Makes sense to me. But then can anybody take public domain code and place it under the GNU General Public License (by dropping it into a Linux source-code file)?

Surely the person doing so would be responsible for doing so, but are they doing anything wrong?

  • > Surely the person doing so would be responsible for doing so, but are they doing anything wrong?

    You're perfectly at liberty to relicense public domain code if you wish.

    The only thing you can't do is enforce the new license against people who obtain the code independently - either from the same source you did, or from a different source that doesn't carry your license.

    • This is correct, and it's not limited to code. I can take the story of Cinderella, create something new out of it, copyright my new work, but Cinderella remains public domain for someone else to do something with.

      If I use public domain code in a project under a license, the work as a whole is under that license, but the public domain code itself is not.

      I'm not sure what the hullabaloo is about.

  • Linux code doesn't have to strictly be GPL-only, it just has to be GPL-compatible.

    If your license allows others to take the code and redistribute it with extra conditions, your code can be imported into the kernel. AFAIK there are parts of the kernel that are BSD-licensed.

  • The core thing about licenses, in general, is that they only grant new usage. If you can already use the code because it's public domain, they don't further restrict it. The license, in that case, is irrelevant.

    Remember that licenses are powered by copyright - granting a license to non-copyrighted code doesn't do anything, because there's no enforcement mechanism.

    This is also why copyright reform for software engineering is so important, because code entering the public domain cuts the gordian knot of licensing issues.

  • SQLite’s source code is public domain. Surely if you dropped the SQLite source code into Linux, it wouldn’t suddenly become GPL code? I’m not sure how it works.

    • The Linux kernel would become a GPLv2-licensed derivative work of SQLite, but that doesn’t matter, because public domain works, by definition, are not subject to copyright restrictions.

      Claiming copyright on an unmodified public domain work is a lie, so in some circumstances could be an element of fraud, but still wouldn’t be a copyright violation.

This ruling is IMO/IANAL based on lawyers and judges not understanding how LLMs work internally, falling for the marketing campaign calling them "AI" and not understanding the full implications.

LLM-creation ("training") involves detecting/compressing patterns of the input. Inference generates statistically probable output based on its similarity to patterns found in the "training" input. Computers don't learn or have ideas; they always operate on representations, so this is nothing more than another mechanical transformation. It should not erase copyright any more than synonym substitution does.

  • >LLM-creation ("training") involves detecting/compressing patterns of the input.

    There's a pretty compelling argument that this is essentially what we do, and that what we think of as creativity is just copying, transforming, and combining ideas.

    LLMs are interesting because that compression forces distilling the world down into its constituent parts and learning about the relationships between ideas. While it's absolutely possible (or even likely for certain prompts) that models can regurgitate text very similar to their inputs, that is not usually what seems to be happening.

    They actually appear to be little remix engines that can fit the pieces together to solve the thing you're asking for, and we do have some evidence that the models are able to accomplish things that are not represented in their training sets.

    Kirby Ferguson's video on this is pretty great: https://www.youtube.com/watch?v=X9RYuvPCQUA

    • So? Why should it be legal?

      If people find this cool and wanna play with it, they can, just make sure to only mix compatible licenses in the training data and license the output appropriately. Well, the attribution issue is still there, so maybe they can restrict themselves to public domain stuff. If LLMs are so capable, it shouldn't limit the quality of their output too much.

      Now for the real issue: what do you think the world will look like in 5 or 10 years if LLMs surpass human abilities in all areas revolving around text input and output?

      Do you think the people who made it possible, who spent years of their life building and maintaining open source code, will be rewarded? Or will the rich reap most of the benefit while also simultaneously turning us into beggars?

      Even if you assume 100% of the people doing intellectual work now will convert to manual work (i.e. there's enough work for everyone) and robots don't advance at all, that'll drive the value of manual labor down a lot. Do you have it gamed out in your head and believe somehow life will be better for you, let alone for most people? Or have you not thought about it at all yet?

  • fortunately, you aren't only operating on representations, right? lemme check my Schopenhauer right quick...