Comment by blibble
2 months ago
> And by the way, training your monster on data produced in part by my own hands, without attribution or compensation.
> To the others: I apologize to the world at large for my inadvertent, naive if minor role in enabling this assault.
this is my position too, I regret every single piece of open source software I ever produced
and I will produce no more
That’s throwing the baby out with the bath water.
The Open Source movement has been a gigantic boon to the whole of computing, and it would be a terrible shame to lose that as a knee-jerk reaction to genAI
> That’s throwing the baby out with the bath water.
it's not
the parasites can't train their shitty "AI" if they don't have anything to train it on
You refusing to write open source will do nothing to slow the development of AI models - there's plenty of other training data in the world.
It will however reduce the positive impact your open source contributions have on the world to 0.
I don't understand the ethical framework for this decision at all.
Yes — that’s the bath water. The baby is all the communal good that has come from FLOSS.
surely that cat's out of the bag by now; and it's too late to make an active difference by boycotting the production of more public(ly indexed) code?
If we end up with only proprietary software, we are the ones who lose
open source code is a minuscule fraction of the training data
Free software has always been about standing on the shoulders of giants.
I see this as doing so at scale, so giving up on its inherent value is most definitely throwing the baby out with the bathwater.
It is. If not you, other people will write their code, maybe of worse quality, and the parasites will train on that. And you cannot forbid other people from writing open source software.
Open source has been good, but I think the expanded use of highly permissive licences has completely left the door open for one sided transactions.
All the FAANGs have the ability to build all the open source tools they consume internally. Why give it to them for free and not have the expectation that they'll contribute something back?
Even the GPL allows companies to simply use code without contributing back, as long as it's unmodified, or consumed through a network boundary. The AGPL closes the latter loophole but still has the former issue.
How dare you chastise someone for making the personal decision not to produce free work anymore? Who do you think you are?
The promise and freedom of open source has been exploited by the least egalitarian and most capitalist forces on the planet.
I would never have imagined things turning out this way, and yet, here we are.
FLOSS is a textbook example of economic activity that generates positive externalities. Yes, those externalities are of outsized value to corporate giants, but that’s not a bad thing unto itself.
Rather, I think this is, again, a textbook example of what governments and taxation are for — tax the people taking advantage of the externalities to pay the people producing them.
Open Source (as opposed to Free Software) was intended to be friendly to business and early FOSS fans pushed for corporate adoption for all they were worth. It's a classic "leopards ate my face" moment that somehow took a couple of decades for the punchline to land: "'I never thought capitalists would exploit MY open source,' sobs developer who advocated for the Businesses Exploiting Open Source movement."
Unfortunately, as I see it, even if you want to contribute to open source out of pure passion or enjoyment, they don't respect the licenses of the code they consume. And the "training" companies are not being held liable.
Are there any proposals to nail down an open source license which would explicitly exclude use with AI systems and companies?
All licenses rely on the power of copyright and what we're still figuring out is whether training is subject to the limitations of copyright or if it's permissible under fair use. If it's found to be fair use in the majority of situations, no license can be constructed that will protect you.
Even if you could construct such a license, it wouldn't be OSI open source because it would discriminate based on field of endeavor.
And it would inevitably catch benevolent behavior that is AI-related in its net. That's because these terms are ill-defined and people use them very sloppily. There is no agreed-upon definition for something like gen AI or even AI.
Even if you license it prohibiting AI use, how would you litigate against such uses? An open source project can't afford the same legal resources that AI firms have access to.
I won't speak for all, but the companies I've worked for, large and small, have always respected licenses and were always very careful when choosing open source.
The fact that they could litigate you into oblivion doesn't make it acceptable.
Where is this spirit when AWS takes a FOSS project, puts it in the cloud and monetizes it?
It exists, hence e.g. AGPL.
But for most open source licenses, that example would be within bounds. The grandparent comment objected to not respecting the license.
Fairly sure it's the same problem, and the main reason stronger licenses are appearing and formerly-OSS companies are closing their sources.
you are saying X, but a completely different group of people didn't say Y that other time! I got you!!!!
> Unfortunately, as I see it, even if you want to contribute to open source out of pure passion or enjoyment, they don't respect the licenses of the code they consume.
Because it is "transformative" and therefore "fair" use.
Running things through lossy compression is transformative?
Fair use is an exception to copyright, but a license agreement can go far beyond copyright protections. There is no fair use exception to breach of contract.
And then having vibe coders constantly lecture us about how the future is just prompt engineering, and that we should totally be happy to desert the skills we spent decades building (the skills that were stolen to train AI).
"The only thing that matters is the end result, it's no different than a compiler!", they say as someone with no experience dumps giant PRs of horrific vibe code for those of us that still know what we're doing to review.
If you're unhappy that bad people might use your software in unexpected ways, open source licenses were never appropriate for you in the first place.
Anyone can use your software! Some of them are very likely bad people who will misuse it to do bad things, but you don't have any control over it. Giving up control is how it works. It's how it's always worked, but often people don't understand the consequences.
>Giving up control is how it works. It's how it's always worked,
no, it hasn't. Open source software, like any open and cooperative culture, existed on a bedrock of what we used to call norms, back when we still had some in our societies and people acted, not always but at least most of the time, in good faith. Hacker culture (the word's in the name of this website), which underpinned so much of it, had many unwritten rules that people respected, even inside companies, when there were still enough people in charge who shared at least some of those values.
Now it isn't just an exception but the rule that people will use what you write in the most abhorrent, greedy and stupid ways and it does look like the only way out is some Neal Stephenson Anathem-esque digital version of a monastery.
Open source software is published to the world and used far beyond any single community where certain norms might apply.
If you care about what people do with your code, you should put it in the license. To the extent that unwritten norms exist, it's unfair to expect strangers in different parts of the world to know what they are, and it's likely unenforceable.
This recently came up for the GPLv2 license, where Linus Torvalds and the Software Freedom Conservancy disagree about how it should be interpreted, and there's apparently a judge that agrees with Linus:
https://mastodon.social/@torvalds@social.kernel.org/11577678...
Inside open source communities maybe. In the corporate world? Absolutely not. Ever. They will take your open source code and do what they want with it, always have.
People do not have perfect foresight, and the ways open source software is used have shifted significantly in recent years. As a result, people are reevaluating whether or not they want to participate.
Yes, very true.
It's not really people, and they don't really use the software.
People training LLM's on source code is sort of like using newspaper for wrapping fish. It's not the expected use, but people are still using it for something.
As they say, "reduce, reuse, recycle." Your words are getting composted.
It's kind of ironic since AI can only grow by feeding on data and open source with its good intentions of sharing knowledge is absolutely perfect for this.
But AI is also the ultimate meat grinder, there's no yours or theirs in the final dish, it's just meat.
And open source licenses are practically unenforceable for an AI system, unless you can maybe get it to cough up verbatim code from its training data.
At the same time, we all know they're not going anywhere, they're here to stay.
I'm personally not against them, they're very useful obviously, but I do have mixed or mostly negative feelings on how they got their training data.
I learned what I learned because of all the openness in software engineering, not because everyone put it behind a paywall.
That might be because most of us got/get paid well enough that this philosophy works, or because our industry is so young, or because people writing code share good values.
It never worried me that a corp would make money off some code I wrote, and it still doesn't. After all, I'm able to write code because I get paid well for writing code, which I do well because of open source. Companies have always benefited from open source code, attributed or not.
Now I use it to write more code.
Though I'm fine with that, I would argue for laws forcing models to be opened up after x years, but I would just prefer the open source community coming together and creating better open models overall.
I've been feeling a lot the same way, but removing your source code from the world does not feel like a constructive solution either.
Some Shareware used to be individually licensed with the name of the licensee prominently visible, so if you had got an illegal copy you'd be able to see whose licensed copy it was that had been copied.
I wonder if that idea of personal responsibility for your copy could be adapted to source code. If you wanted to contribute to a piece of software, you could ask a contributor and then get a personally licensed copy of the source code with your name in every source file... but I don't know where to take it from there. Has there ever been a system like that one could take inspiration from?
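The shareware-style stamping described above could be sketched roughly like this (everything here is hypothetical — the function name, the header format, the `.py`-only scope — and a plain comment header is trivially strippable, so a real scheme would need tamper-resistant watermarking):

```python
from pathlib import Path

# Hypothetical per-licensee header, in the spirit of old personally
# licensed shareware copies.
LICENSE_STAMP = "# Licensed copy issued to: {name} ({email})\n"

def stamp_sources(src_dir: str, name: str, email: str) -> int:
    """Prepend a personalized license line to every .py file under src_dir.

    Returns the number of files stamped. Idempotent: files that already
    carry this licensee's header are left untouched.
    """
    header = LICENSE_STAMP.format(name=name, email=email)
    stamped = 0
    for path in Path(src_dir).rglob("*.py"):
        original = path.read_text(encoding="utf-8")
        if not original.startswith(header):
            path.write_text(header + original, encoding="utf-8")
            stamped += 1
    return stamped
```

The point of the stamp is social, not cryptographic: if a copy leaks, the header says whose copy it was, mirroring how personalized shareware made redistribution personally attributable.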
Why? The core vision of free software and many open source licenses was to empower users and developers to make things they need without being financially extorted, to avoid having users locked in to proprietary systems, to enable interoperability, and to share knowledge. GenAI permits all of this to a level beyond just providing source code.
Most objections like yours are couched in language about principles, but ultimately seem to be about ego. That's not always bad, but I'm not sure why it should be compelling compared to the public good that these systems might ultimately enable.
> and I will produce no more
Nah, don't do that. Produce shitloads of it using the very same LLM tools that ripped you off, but license it under the GPL.
If they're going to thieve GPL software, the least we can do is thieve it back.
That's a weird position to take. Open source software is actually what is mitigating this stupidity in my opinion. Having monopolistic players like Microsoft and Google is what brought us here in the first place.
What a miserable attitude. When you put something out in the world, it's out there for anyone to use, and it always has been, long before AI.
it is (... was) there to use for anyone, on the condition that the license is followed
which they don't
and no self-serving sophistry about "it's transformative fair use" counts as respecting the license
The license only has force because of copyright. For better or for worse, the courts decide what is transformative fair use.
Characterizing the discussion behind this as "sophistry" is a fundamentally unserious take.
For a serious take, I recommend reading the copyright office's 100 plus page document that they released in May. It makes it clear that there are a bunch of cases that are non-transformative, particularly when they affect the market for the original work and compete with it. But there's also clearly cases that are transformative when no such competition exists, and the training material was obtained legally.
https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
I'm not particularly sympathetic to voices on HN that attempt to remove all nuance from this discussion. It's a challenging enough topic as it is.
*in your opinion
Was it ever open source if there was an implied refusal to create something you don't approve of? Was it only for certain kinds of software, certain kinds of creators? If there was some kind of implicit approval process or consent requirement, did you publish it? Where can that be reviewed?
> and I will produce no more
Thanks for your contributions so far but this won't change anything.
If you want to have a positive impact on this matter, it's better to pressure the government(s) to prevent GenAI companies from using content they don't have a license for, so they behave like any other business that came before them.
What people like Rob Pike don't understand is that the technology wouldn't be possible at all if creators needed to be compensated. Would you really choose a future where creators were compensated fairly, but ChatGPT didn't exist?
> What people like Abraham Lincoln don't understand is that the technology wouldn't be possible at all if slaves needed to be compensated. Would you really choose a future where slaves were compensated fairly, but plantations didn't exist?
I fixed it... Sorry, I had to, the quote template was simply too good.
"Too expensive to do it legally" doesn't really stand up as an argument.
Unequivocally, yes. There are plenty of "useful" things that can come out of doing unethical things, that doesn't make it okay. And, arguably, ChatGPT isn't nearly as useful as it is at convincing you it is.
Absolutely. Was this supposed to be some kind of gotcha?
> Would you really choose a future where creators were compensated fairly, but ChatGPT didn't exist?
Yes.
I don't see how "We couldn't do this cool thing if we didn't throw away ethics!" is a reasonable argument. That is a hell of a thing to write out.
Yes, very much so. I am in favour of pushing into the future as fast as we can, so to speak, but I think ChatGPT is a temporary boost that is going to slow us in the long run.
Very much yes, how can I opt into that timeline?
Yes, what a wild position to prefer the job loss, devaluation of skills, and environmental toll of AI to open source creators having been compensated in some better manner.
That would be like being able to keep my cake and eat it too. Of course I would. Surely you're being sarcastic?
Uh, yeah, he clearly would prefer it didn’t exist even if he was compensated.
Er... yes? Obviously? What are you even asking?
Yes.
Um, please let your comment be sarcastic. It is ... right?
Yes.
Yes.
Well yeah.