Comment by jakelazaroff

2 months ago

[flagged]

77 comments

jakelazaroff

I always wonder if the general sentiment toward genai would be positive if we had wealth redistribution mechanisms in place, so everyone would benefit. Obviously that's not the case, but if you consider the theoretical, do you think your view would be different?

jakelazaroff 2 months ago

To be honest, I'm not even sure I'm fully on board with the labor theft argument. But I certainly don't think generative AI is such an unambiguous boon for humankind that we should ignore any possible negative externalities just to advance it.

ThrowawayR2 2 months ago

> "To someone who believes that AI training data is built on the theft of people's labor..."

i.e. people who are not hackers. Many (most?) hackers have been against the idea of copyright and intellectual property from the beginning. "Information wants to be free," after all.

Must be galling for people to find themselves on the same side as Bill Gates and his Open Letter to Hobbyists in 1976 which was also about "theft of people's labor".

anonym29 2 months ago

[flagged]

sirwhinesalot 2 months ago
It's not free. There is a license attached. One you are supposed to follow and not doing so is against the law.
- anonym29 2 months ago
  
  [flagged]
  
jakelazaroff 2 months ago

> The difference is that people who write open source code or release art publicly on the internet from their comfortable air conditioned offices voluntarily chose to give away their work for free
That is not nearly the extent of AI training data (e.g. OpenAI training its image models on Studio Ghibli art). But if by "gave their work away for free" you mean "allowed others to make [proprietary] derivative works", then that is in many cases simply not true (e.g. GPL software, or artists who publish work protected by copyright).
grandinquistor 2 months ago

What? Over 183K books were pirated by these big tech companies to train their models. They knew what they were doing was wrong.
michaelsshaw 2 months ago

Perhaps you should Google the definition of metaphor before commenting.

refulgentis 2 months ago

[flagged]

mmooss 2 months ago
You're changing the subject. What about the actual point?
- refulgentis 2 months ago
  
  [flagged]
  
jakelazaroff 2 months ago
I mean, yeah, if you omit any objectionable detail and describe it in the most generic possible terms then of course the comparison sounds tasteless and offensive. Consider that collecting child pornography is also "storing the result of an HTTP GET".
- refulgentis 2 months ago
  
  [flagged]
  
- ronsor 2 months ago
  
  The objection to CSAM is rooted in how it is (inhumanely) produced; people are not merely objecting to a GET request.
  

ronsor 2 months ago

> believes that AI training data is built on the theft of people's labor

I mean, this is an ideological point. It's not based in reason, won't be changed by reason, and is really only a signal to end the engagement with the other party. There's no way to address the point other than agreeing with them, which doesn't make for much of a debate.

> an 1800s plantation owner saying "can you imagine trying to explain to someone 100 years from now we tried to stop slavery because of civil rights"

I understand this is just an analogy, but for others: people who genuinely compare AI training data to slavery will have their opinions discarded immediately.

zaptheimpaler 2 months ago
We have clear evidence that millions of copyrighted books have been used as training data because LLMs can reproduce sections from them verbatim (and emails from employees literally admitting to scraping the data). We have evidence of LLMs reproducing code from github that was never ever released with a license that would permit their use. We know this is illegal. What about any of this is ideological and unreasonable? It's a CRYSTAL CLEAR violation of the law and everyone just shrugs it off because technology or some shit.
- shkkmo 2 months ago
  
  You keep conflating different things.
  > We have evidence of LLMs reproducing code from github that was never ever released with a license that would permit their use. We know this is illegal.
  What is illegal about it? You are allowed to read and learn from publicly available unlicensed code. If you use that learning to produce a copy of those works, that is infringement.
  Meta clearly engaged in copyright infringement when they torrented books that they hadn't purchased. That was infringement before they even started training on the data. That doesn't make the training itself infringement, though.
  
- Alex2037 2 months ago
  
  >We know this is illegal
  >It's a CRYSTAL CLEAR violation of the law
  in the court of reddit's public opinion, perhaps.
  there is, as far as I can tell, no definite ruling about whether training is a copyright violation.
  and even if there was, US law is not global law. China, notably, doesn't give a flying fuck. kill American AI companies and you will hand the market over to China. that is why "everyone just shrugs it off".
  
- ReflectedImage 2 months ago
  
  All creative types train on other creatives' work. People don't create award-winning novels or art pieces from scratch. They steal ideas and concepts from other people's work.
  The idea that they are coming up with all this stuff from scratch is Public Relations bs. Like Arnold Schwarzenegger never taking steroids, only believable if you know nothing about body building.
  
mmooss 2 months ago
[flagged]
- tombert 2 months ago
  
  > It's very much based on reason and law.
  I have no interest in the rest of this argument, but I take a bit of issue with this particular point. I don't think the law is fully settled on this in any jurisdiction, and certainly not in the United States.
  "Reason" is a more nebulous term; I don't think that training data is inherently "theft", any more than inspiration would be even before generative AI. There's probably not an animator alive that wasn't at least partially inspired by the works of Disney, but I don't think that implies that somehow all animations are "stolen" from Disney just because of that fact.
  Where you draw the line on this is obviously subjective, and I've gone back and forth, but I find it really annoying that everyone is acting like this is so clear cut. Evil corporations like Disney have been trying to use this logic for decades to abuse copyright and outlaw being inspired by anything.
  
- refulgentis 2 months ago
  
  [flagged]
beepbooptheory 2 months ago

What makes something more or less ideological for you in this context? Is "reason" always opposed to ideology for you? What is the ideology at play here for the critics?
zwnow 2 months ago

> I mean, this is an ideological point. It's not based in reason
You can't be serious.