Comment by heavyset_go

2 days ago

> Why would you be unwilling to merge AI code at all?

Because AI code cannot be copyrighted. It is not anyone's IP. That matters when you're creating IP.

edit: Assuming this is a real person I'm responding to, and this isn't just a marketing gimmick: having seen the trail you've left on the internet over the past few weeks, it strikes me as mania, possibly chatbot-induced. I don't know what I can say that could help, so I'm dropping out of this conversation and wish you the best.

This is a position that seems to be as unenforceable as AI can't be trained on code whose copyright owners have not given consent.

The main reason for being unwilling to merge AI code is going to be that it sets a precedent that AI code is acceptable. Suddenly, maintainers need to be able to make judgement calls on a case-by-case basis of what constitutes an acceptable AI contribution, and AI is going to be able to generate far more slop than people will ever have the time to review and agree upon.

  • > This is a position that seems to be as unenforceable as AI can't be trained on code whose copyright owners have not given consent.

    This depends on what courts find. At least one non-precedent-setting case found model training on basically everyone's IP without permission to be fair use. If it's fair use, consent isn't needed, licenses don't matter, and the only way to prevent training on your content is to withhold it and gate it behind contracts that forfeit your clients' rights to fair use.

    But that is beside the point. Even if what you claim were the case, my point is that AI output isn't property. It's not property whether its training corpus was full of licensed or unlicensed content. This is what the US Copyright Office determined.

    If you include AI output in your product, that part of it isn't yours. It isn't anybody's, so anyone can copy it and do whatever they want with it, including the AI/cloud providers whose LLMs you allowed to slurp up your code as context.

    You want to own your IP. You don't want to say "we own the 30% of the product we wrote, but the other 70% is non-property that anyone can copy/use/sell with no strings attached, not even open source licenses". This matters whether you're building a business or an open source project. If you include AI code in your open source project, that part of the project isn't covered by your license. LLMs can't sign CLAs and they can't produce intellectual property that can be licensed or owned. The more of your project that is developed by AI, the less of it is yours, and the less of it can be covered by your open source license of choice.

    • > This is what the US Copyright Office determined.

      There are nearly two hundred countries in the world. Whatever the "US Copyright Office" determines applies to only one of them.

  • > This is a position that seems to be as unenforceable as AI can't be trained on code whose copyright owners have not given consent.

    Stares at Facebook stealing terabytes of copyrighted content to train their models

    Also, even if a model is trained only on code under FLOSS-approved licenses, the GPL-based ones have some caveats that would disqualify many projects from including the generated code.