Comment by meowface

6 days ago

I agree this is a bad "collective punishment" argument from him, even if I think he's somewhat right in spirit. As a software dev, I don't care in the slightest about LLMs training on code, text, videos, or images, and I fully believe it's equivalent to humans perceiving and learning from the output of others. I know many or most software devs agree on that point, while most artists don't.

I think artists are very cavalier about IP, on average. Many draw characters from franchises that do not allow such drawing, and often directly profit by selling those images. Do I think that's bad? No. (Unless it's copying the original drawing plagiaristically.) Is it odd that most of the people who profit in this way consider generative AI unethical copyright infringement? I think so.

I think hypocrisy on the issue is annoying. Either you think it's cool for LLMs to learn from code and text and images and videos or you don't think any of it is fine. tptacek should bite one bullet or the other.

I don't accept the premise that "training on" and "copying" are the same thing, any more than me reading a book and learning stuff is copying from the book. But past that, I have, for the reasons stated in the piece, absolutely no patience for software developers trying to put this concern on the table. From my perspective, they've forfeited it.

  • > I don't accept the premise that "training on" and "copying" are the same thing...

    Nor do I. Training and copying are clearly different things... and if these tools had never emitted -verbatim- nontrivial chunks of the code they'd ingested, [0] I'd be much less concerned about them. But as it stands now, some-to-many of the companies that build and deploy these machines clearly didn't care to ensure that their machines simply wouldn't plagiarize.

    I've a bit more commentary, related to whether or not what these companies are doing should be permitted, here. [1]

    [0] Based on what I've seen, when it happens, it often comes either with incorrect copyright and/or license notices, or with none of the verbiage that the copied code's license requires in non-trivial reproductions of that code.

    [1] <https://news.ycombinator.com/item?id=44166983>

  • Who is this "they" who have forfeited it?

    What about the millions of software developers who have never even visited a pirate site, much less built one?

    Are we including the Netflix developers working actively on DRM?

    How about the software developers working on anti-circumvention code for Kindle?

    I'm totally perplexed at how willing you are to lump a profession of more than 20 million people all into one bucket and deny all of them, collectively, the right to say anything about IP. Are doctors not allowed to talk about the societal harms of elective plastic surgery because some of them are plastic surgeons? Is anyone with an MBA not allowed to warn people against scummy business practices because many-to-most of them are involved in dreaming those practices up?

    This logic makes no sense, and I have to imagine that you see that given that you're avoiding replying to me.