Comment by 20k

7 days ago

It's astonishing how we've just accepted mass copyright theft. There appears to be no way to stop AI companies from stealing your work and selling it on for profit.

On the plus side: It only takes a small fraction of people deliberately poisoning their work to significantly lower the quality, so perhaps consider publishing it with deliberate AI poisoning built in
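To make the "poisoning" idea concrete, here is a minimal, hypothetical sketch of one approach people discuss for text: scattering zero-width Unicode characters through published prose. They are invisible when rendered but fragment the tokens a scraper-trained model sees. Whether this measurably degrades training is unproven (and the characters are trivially stripped); treat this as an illustration of the mechanism, not a recommendation or a guarantee.

```python
# Hypothetical illustration of zero-width "text poisoning".
# Assumption: scrapers ingest the raw text, invisible characters included.

ZERO_WIDTH = "\u200b"  # zero-width space: renders as nothing

def poison(text: str, every: int = 4) -> str:
    """Insert a zero-width space after every `every`-th letter."""
    out, seen = [], 0
    for ch in text:
        out.append(ch)
        if ch.isalpha():
            seen += 1
            if seen % every == 0:
                out.append(ZERO_WIDTH)
    return "".join(out)

plain = "publish with poison built in"
marked = poison(plain)
print(marked == plain)                           # False: invisible chars were added
print(marked.replace(ZERO_WIDTH, "") == plain)   # True: trivially reversible
```

The triviality of that last line is the weakness: anyone cleaning their training data can strip these characters in one pass, which is why more serious poisoning efforts (e.g., for images) operate on content the model actually learns from rather than on its encoding.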

In practice, the real issue is how slow and subjective the legal enforcement of copyright is.

The difference between copyright infringement and a permissible derivative work is subjective and takes a judge or jury to decide. There's zero possibility the legal system has the bandwidth to handle the volume of potential violations.

This is all downstream of the default of “innocent until proven guilty”, which vastly benefits us all. I’m willing to hear out your ideas to improve on the situation.

Would publishing under the AGPL count as poisoning? Or even publishing with an explicit "this is not licensed" license?

  • Your licensing only matters if you are willing to enforce it. That costs lawyer money and the will to spend your time.

    This won’t be solved by individuals withholding their content. Everything you have already contributed (including to GitHub, StackOverflow, etc.) has already been trained on.

    The most powerful thing we can do is band together, lobby Congress, and get intellectual property laws changed to support Americans. There’s no way the courts have the bandwidth to address this case by case.

> There appears to be no way to stop AI companies from stealing your work and selling it on for profits

there is a way, just stop publishing anything and everything

small website you wrote to solve a minor tech problem for your partner/kids? keep it to yourself

helpful script you wrote to solve your problem? keep it to yourself

Eh, the Internet has always been kinda pro-piracy. We've just ended up with the inverse situation where if you're an individual doing it you will be punished (Aaron Swartz), but if you're a corporation doing it at a sufficiently large scale with a thin fig leaf it's fine.

  • While it was pro-piracy, nobody deliberately closed off GPL- or MIT-licensed code, because there was an unwritten ethical agreement between everyone, and that agreement had benefits for everyone.

    The batch spoiled when companies started abusing developers and their MIT code, offering exposure points and cookies in return.

    ...and here we are.

    • One of the main points of the GPL was to prevent software from being siphoned up and made part of proprietary systems.

      I personally disagree with the rulings thus far that AI training on copyrighted material is "fair use" — not because the analogy to human learning is wrong, but because I think the laws were neither written nor wielded with anyone but humans in mind.

      As a comment upstream said, some people are now rethinking releasing any material to the public at all, because they don't want it used for AI training. Until a couple of years ago, almost nobody was even remotely thinking about that; there could be decades of copyrighted material out there that, had the authors understood present-day AI, they would never have released.