
Comment by CrossVR

20 hours ago

There's one elephant in the room that's not being addressed:

Training an AI on GPL code and then having it generate equivalent code released under a closed-source license seems like a good way to destroy the copyleft FOSS ecosystem.

People were violating the terms of the GPL without consequence long before AI. It is very difficult to determine whether binaries were compiled from fragments of GPL code.

The place I have found AI most useful in coding is stripping away layers of abstraction. It is difficult to say as a long-time open source contributor, but libraries often try to cater to everyone and become slow, monolithic piles of abstraction. All the parts of an open source project that are copyrightable are abstraction. When you strip away all the branching and write a script that performs exactly the side effects some library would have produced for a specific set of arguments, you are left with something that is not novel. It's quite liberating to stop fighting errors deep in some UVC driver and just pull raw bytes from a USB device, without a mountain of indirection from decades of irrelevant edge-case handling.

This is 100% already happening. No need to worry about licensing or dependencies anymore; just have the LLM launder the code into a plausibly different structure!

  • This kind of reminds me of how I saw some teams deal with a vulnerability scanner flagging an OSS dependency as having a reported vulnerability. The dependency was OSS anyway, so: copy and paste the entire thing into your project. Voilà, the dependency scanner no longer finds any problems.

If it is clearly the same code, the copyright applies to the copy. If it is meaningfully different, it does not.

This is what it was before AI, and it remains so today.

AI reproducing code without holding rights to it is a failure case that should be eliminated.

IP laundering is a big part of AI's appeal: it's why big companies are so excited, and why workers and artists are less so.