Comment by otterley

2 months ago

Out of curiosity, given two code submissions that are completely identical—one written solely by a human and one assisted by AI—why should its provenance make any difference to you? Is it like fine art, where it’s important that Picasso’s hand drew it? Or is it like an instruction manual, where the author is unimportant?

Similarly, would you consider it to be dishonest if my human colleague reviewed and made changes to my code, but I didn’t explicitly credit them?

Why does the provenance make any difference? Let me increase your options. Option 1: You completely hand-wrote it. Option 2: You were assisted by an AI, but you carefully reviewed it. Option 3: You were assisted by an AI (or the AI wrote the whole thing), and you just said, "looks good, YOLO".

Even if the code is line-for-line identical, the difference is in how much trust I am willing to give the code. If I have to work in the neighborhood of that code, I need to know what degree of skepticism I should be viewing it with.

  • That's the thing. As someone evaluating pull requests, should you trust the code based on its provenance, or should you trust it based on its content? Automated testing can validate code, but it can't validate people.

    ISTM the most efficient and objective solution is to invest in AI more on both sides of the fence.

    • In the future, that may be fine. We're not in that future yet. We're still at a place where I don't fully trust AI-only code to be as solid as code that is at least thoroughly reviewed by a knowledgeable human.

      (Yes, I put "AI-only" and "knowledgeable" in there as weasel words. But I think that with them, it is not currently a very controversial case.)

Yes because you can be sued for copyright violation if you don't know the origin of one, and not the other.

  • As an attorney, I know copyright law. (This is not legal advice.) There's nothing about copyright law that says you have to credit an AI coding agent for contributing to your work. The person receiving the code has to perform their due diligence in any case to determine whether the author owns it or has permission from the owner to contribute it.

    • Can you back this up with legal precedence? To my knowledge, nothing of the sort has been ruled by the courts.

      Additionally, this raises another big issue. A few years ago, a couple guys used software (what you could argue was a primitive AI) to generated around 70 billion unique pieces of music which amounts to essentially every piece of copyrightable music using standard music scales.

      Is the fact that they used software to develop this copyrighted material relevant? If not, then their copyright should certainly be legal and every new song should pay them royalties.

      It seems that using a computer to generate results MUST be added as an additional bit of analysis when it comes to infringement cases and fair use if not a more fundamental acknowledgement that computer-generated content falls under a different category (I'd imagine the real argument would be over how much of the input was human vs how much was the system).

      Of course, this all sets aside the training of AI using copyrighted works. As it turns out, AI can regurgitate verbatim large sections of copyrighted works (up to 80% according to this study[0]) showing that they are in point of fact outright infringing on those copyrights. Do we blow up current AI to maintain the illusion of copyright or blow up current copyright law to preserve AI?

      [0] https://arxiv.org/pdf/2603.20957

      1 reply →

    • Late reply, but I meant that the sources stolen by the big AI companies are copyrighted. So the output is tainted by that fact.