Comment by NiloCK

6 hours ago

It's not that code is distinct or "less than" art. It's an authority and boundaries question.

I've written a fair amount of open source code. On anything like a per-capita basis, I'm way above median in terms of what I've contributed (without consent) to the training of these tools. I'm also specifically "in the crosshairs" in terms of work loss from automation of software development.

I don't find it hard to convince myself that I have moral authority to think about the usage of gen AI for writing code.

The same is not true for digital art.

There, the contribution-without-consent, aka theft (I could frame it differently when I was the victim, but here I can't), is entirely from people other than me. The current and future damages won't be borne by me.

Alright, if I understand correctly, what you're saying is they make this distinction because they operate in the "text and code" space but not in the media space.

I've written _a lot_ of open source, MIT-licensed code, and I'm on the fence about that being part of the training data. I published it as much for other people to learn from as for fun.

I also build and sell closed source commercial JavaScript packages, and more than likely those have ended up in the training data as well. Obviously without consent. So this is why I feel strongly about making this separation between code and media: from my perspective it all has the same problem.

  • re: MIT license, I generally tell people they have to credit and that's functionally the only requirement. Are they crediting? That's really the lowest imaginable bar, they're not asked to do ANYTHING else.