Comment by TomasBM

9 days ago

Thanks for replying.

That's fair. I completely agree that much of LLM training was (and still very much is) in violation of many licenses. At the very least, the fact that the sources of training data remain obfuscated years after the training shows that developers didn't care about attribution and licenses, if they didn't deliberately violate them outright.

Your conditions make sense. If I had anything I thought was too valuable or prone to be blatantly stolen, I would think thrice about whom I share it with.

Personally, ever since discovering FOSS, I've realized that it'd be very difficult to enforce any license. The problem with public repositories is that it's trivial for those not following the gentleman's agreement to plagiarize the code. Other than recognizing blatant copy-pasting, I don't know how I'd prevent anyone from just lightly remixing my content.

Instead, I've come to see FOSS like scientific contributions:

- I contribute to the community. If someone remixes my code without attribution, it's unfair, but I believe that there are more good than bad contributors.

- I publish stuff that I know is personally original, i.e., I didn't remix without attribution. I can't know if some other publisher had the same idea in isolation, or remixed my stuff, but provenance and plagiarism should become apparent over time, across multiple contributions, mine and theirs.

- I don't make public anything that I can see my future self regretting. At the same time, I've always seen my economic value in continuous or custom work, not in products themselves. For me, what I produce is also a signal of future value.

- I think bad-faith behavior is unsustainable. Sure, power delays the consequences, but I've seen people discuss injustice and stolen valor from centuries ago, let alone recent examples.