Comment by Orygin
1 day ago
Great article but I don't really agree with their take on GPL regarding this paragraph:
> The spirit of the GPL is to promote the free sharing and development of software [...] the reality is that they are proceeding in a different vector from the direction of code sharing idealized by GPL. If only the theory of GPL propagation to models walks alone, in reality, only data exclusion and closing off to avoid litigation risks will progress, and there is a fear that it will not lead to the expansion of free software culture.
The spirit of the GPL is the freedom of the user, not the code being freely shared. The virality is a byproduct to ensure the software is not stolen from their users. If you just want your code to be shared and used without restrictions, use MIT or some other license.
> What is important is how to realize the “freedom of software,” which is the philosophy of open source
Freedom of software means nothing. Freedoms are for humans, not immaterial code. Users get the freedom to enjoy the software how they like. Washing the code through an AI to purge it of its license goes against the open source philosophy. (I know this may be a mistranslation, but it goes in the same direction as the rest of the article).
I also don't agree with the arguments that since a lot of things are included in the model, the GPL code is only a small part of the whole, and that means it's okay. Well if I take 1 GPL function and include it in my project, no matter its size, I would have to license as GPL. Where is the line? Why would my software which only contains a single function not be fair use?
There are many misconceptions of the GPL, gnu, and free software movement. I love the idealism of free software and you hit the nail on the head.
Below are the four freedoms for those who are interested. Straight from the horse's mouth: https://www.gnu.org/philosophy/free-sw.html
> The spirit of the GPL is the freedom of the user, not the code being freely shared.
who do you mean by "user"?
the spirit is that the person who actually uses the software also has the freedom to modify it, and that the users receiving these modifications have the same rights.
is that what you meant?
and while technically that's the spirit of the GPL, the license is not only about users, but about a _relationship_, that of the user and the software and what the user is allowed to do with the software.
it thus makes sense to talk about "software freedom".
last but not least, about a single GPL function --- many GPL _libraries_ are licensed less restrictively, LGPL.
I don't think you understand the GPL.
> "the user is allowed to do with the software"
The GPL does not restrict what the user does with the software.
It can be USED for anything.
But it does restrict how you redistribute it. You have responsibilities if you redistribute it. You must provide the source code, and pass on the same freedoms you received to the users you redistribute it to.
Thinking on though, if the models are trained on any GPL code then one could consider that they contain that GPL code, and are constantly and continually updating and modifying that code, thus everything the model subsequently outputs and distributes should come under the GPL too. It’s far from sufficient that, say, OpenAI have a page on their website to redistribute the code they consume in their models if such code becomes part of the model’s training data that is resident in memory every time it produces new code for users. In the spirit of the GPL all that derivative code seems to also come under the GPL, and has to be made available for free, even if upon every request the generated code is somehow novel or unique to that user.
first I thought you'd go into the nuance of gpl2 vs 3 or lgpl vs gpl vs agpl? patents, tivoization, cloud use?
:-)
I agree, I didn't make any statement about what you can do with the software as long as you are licensed to use it
you are allowed to build atomic bombs, nuclear power plants, tanks, whatever.
but only as long as you comply i.e. give your downstream the freedom you've received.
if you fail at that, you're no longer allowed to use the software for anything.
see section 8 Termination for details
https://www.gnu.org/licenses/gpl-3.0.html#license-text
> The virality is a byproduct to ensure the software is not stolen from their users.
If Microsoft misappropriates GPL code how exactly is that "stealing" from me, the user, of that code? I'm not deprived in any way, the author is, so I can't make sense of your premise here.
> Freedom of software means nothing.
Software is information. Does "freedom of information" mean nothing? I think you're narrowing concepts here into something not particularly useful or reflective of reality.
> Users get the freedom to enjoy the software how they like.
The freedom is to modify the code for my own purposes. This is not at all required to plainly "enjoy" the software. I instead "enjoy a particular benefit."
> Why would my software which only contains a single function not be fair use?
Because fair use implies educational, informational, or transformational outputs. Your software is none of those things.
"If Microsoft misappropriates GPL code how exactly is that "stealing" from me, the user, of that code? I'm not deprived in any way."
Yes you are. You are just deprived of something you apparently don't recognize or value, but that doesn't make it ok.
The original author was also stolen from and that doesn't rely on your understanding or perception.
The original author set some terms. The terms were not money, but they are terms exactly like money. They said "you can have this, and the only price is you have to make the source, and the further right to redistribute, available to any user you hand a binary to."
Well MS handed you a binary and did not also hand you the source or the right to redistribute.
That stole from both you and the original author and me who might otherwise have benefited from your own child work. The fact that you personally apparently were never going to make use of something they owe you doesn't change the fact that they owe you, and the original author and me.
It is a tale as old as time, and one which no doubt all of us repeat at some point in our lives. There are hundreds of clichéd books, hundreds of songs, and thousands of letters that echo this sentiment.
We are rarely capable of valuing the freedoms we have never been deprived of.
To be privileged is to live at the quiet centre of a never-ending cycle: between taking a freedom for granted (only to eventually lose it), and fighting for that freedom, which we by then so desperately need.
And as Thomas Paine put it: "Those who expect to reap the blessings of freedom, must, like men, undergo the fatigues of supporting it."
As a user I suffer from not being able to freely use or derive my own work from Microsoft’s
This. People conflate consumer with user. A user in the sense of GPL is a programmer or technical person whom the software (including source) is intended for.
Not necessarily a “user of an app” but a user of this “suite of source code”.
At this point they've contributed a reasonably-fair share of open-source code themselves.
No one benefits from locking up 99.999% of all source code, including most of Microsoft's proprietary code and all GPL code.
No one.
When it comes to AI, the only foreseeable outcome to copyright maximalism is that humans will have to waste their time writing the same old shit, over and over, forever less one day [1], because muh copyright!!!1!
1: https://en.wikipedia.org/wiki/Copyright_Term_Extension_Act
> If Microsoft misappropriates GPL code how exactly is that "stealing" from me, the user, of that code? I'm not deprived in any way, the author is, so I can't make sense of your premise here.
The user in this example is deprived of freedoms 1, 2, and 3 (and probably freedom 0 as well if there are terms on what machines you can run the derivative binary on).
Read more here: https://www.gnu.org/philosophy/free-sw.html
Whether or not the user values these freedoms is another thing entirely. As the software author, licensing your code under the GPL is making a conscious effort to ensure that your software is and always will be free (not just as in beer) software.
The GPL arose from Stallman's frustration at not having access to the source code for a printer driver that was causing him grief.
In a world where he could have just said "Please create a PDP-whatever driver for an IBM-whatever printer," there never would have been a GPL. In that sense AI represents the fulfillment of his vision, not a refutation or violation.
I'd be surprised if he saw it that way, of course.
The safeguards will prevent the AI from reproducing the proprietary drivers for the IBM-whatever printer, and it will not provide code that breaks the DRM that exists to prevent third-party drivers from working with the printer. There will however be no such safeguards or filters to prevent IBM from writing a proprietary driver for their next printer, using existing GPL drivers as a building block.
Code will only ever go in one direction here.
Then we'd better stop fighting against AI, and start fighting against so-called "safeguards."
But that isn't the same code that you were running before. And like, let's not forget GPLv3: "please give me the code for a mobile OS that could run on an iPhone" does not in any way help me modify the code running on MY iPhone.
Sure it does. Just tell the model to change whatever you want changed. You won't need access to the high-level code, any more than you need access to the CPU's microcode now.
We're a few years away from that, but it will happen unless someone powerful blocks it.
The only legal way to do that in the proprietary software world is a clean room implementation.
An AI could never do a clean room implementation of anything, since it was not trained on clean room materials alone. And it never can be, for obvious reasons. I don't think there's an easy way out here.
In said hypothetical world, though, the whatever-driver would also have been written by LLMs; and, if the printer or whatever is non-trivial and made by a typical large company, many LLM instances with a sizable amount of token spending over a long period of time.
So getting your own LLM rewrite to an equivalent point (or, rather, less buggy as that's the whole point!) would be rather expensive; at the absolute very least, certainly more expensive than if you still had the original source code to reference or modify (even if an LLM is the thing doing those). Having the original source code is still just strictly unconditionally better.
Never mind the question of how you even get your LLM to reverse-engineer & interact with & observe the physical hardware of your printer, and whatever wasted ink during debugging of the reinvention of what the original driver already did correctly.
Now I'm kind of curious if you give an LLM the disassembly of a proprietary firmware blob and tell it to turn it into human-readable source code, how good is it at that?
You could probably even train one to do that in particular. Take existing open source code and its assembly representations as training data and then treat it like a language translation task. Use the context to guess what the variable names were before the original compiler discarded them etc.
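To make the "translation task" framing concrete, here is a rough sketch of how such training pairs could be assembled. It is only an illustration under loud assumptions: it uses Python bytecode (via the standard `dis` module) as a stand-in for real machine assembly so that it runs without a compiler toolchain, and the names `disassemble`, `build_pairs`, and `samples` are made up for this example. Python bytecode also keeps local variable names, which a stripped firmware blob would not, so the real task is harder than this suggests.

```python
import dis
import io


def disassemble(source: str) -> str:
    """Compile Python source and return its bytecode listing as text.

    Stand-in for "compile to assembly": a real pipeline would run a
    C compiler and keep the emitted assembly instead.
    """
    code = compile(source, "<sample>", "exec")
    buf = io.StringIO()
    dis.dis(code, file=buf)
    return buf.getvalue()


def build_pairs(sources):
    """Pair each low-level listing with the source it came from.

    The low-level text is the model input and the original source is
    the target, as in a language translation dataset.
    """
    return [{"input": disassemble(src), "target": src} for src in sources]


# Hypothetical "open source corpus" of two tiny functions.
samples = [
    "def add(a, b):\n    return a + b\n",
    "def scale(values, k):\n    return [v * k for v in values]\n",
]

pairs = build_pairs(samples)

if __name__ == "__main__":
    for pair in pairs:
        print(pair["input"])
        print("->")
        print(pair["target"])
```

Each entry in `pairs` is one training example: the disassembly goes in, the human-readable source is what the model is asked to produce.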