Comment by gspr
11 hours ago
> I'm not sure where in our lawbooks there are laws that specifically target humans to the exclusion of human-operated tools.
If we take the point of view that LLMs are tools (I agree), then people need to be absolutely certain that these tools don't contain (compressed) representations of copyrighted works.
People seem not to want to do that. And they argue that the LLMs have "learned" or "been inspired" by the copyrighted works, which is OK for humans.
This is the problem. People can't even agree on which of two mutually exclusive defenses to appeal to! Are LLMs tools which we have to ensure aren't used to reproduce copyrighted work without permission, or are they entities that can be granted exemptions like humans can? It can't be both!
> There's also a TON of irony here. What an about face it is, for the community at large* to switch from "information wants to be free, we support copyleft and FOSS" to leaning so heavily on an incredibly conservative reading of IP law.
True. Meanwhile, IP-owning companies like Microsoft now say "it's online, so we can use it".
It's bizarre.
I'll tell you what: I'll drop my conservative stance in defense of FOSS when Windows and the latest Hollywood movie are "fair use" for consumption by whatever LLM I cook up.
> If we take the point of view that LLMs are tools (I agree), then people need to be absolutely certain that these tools don't contain (compressed) representations of copyrighted works.
I've pointed out elsewhere in this thread that this is the opposite of how the real world works.
In actual fact, people who need software built hire a tool (e.g., a software developer like me) to build it for them. That tool, me or you, carries inside it a tremendous library of represented copyrighted works. I've worked on enough different projects over the decades that the next CRUD function, or rule-driven data-entry tool, or whatever, that I build is going to draw very significantly from the last ones I built. And those last ones were copyrighted, with those rights held by my employer at the time, and maybe even protected by NDA or defense-style classifications.
Is your position that this is OK so long as it's stuff that I can keep in my squishy brain, but the moment that mechanism moves to silicon, it somehow becomes fundamentally different?
The other major argument I see in this thread is that for LLMs it's different because there's a third party who is aggregating the data and selling me (or my employer) use of that tool. But this doesn't change the overall picture at all. It just adds one more layer of dereferencing. The addition of that middleman hasn't altered the moral landscape: how is hiring me, along with what's in my memory, different from hiring the combination of me plus a helper to supplement my memory? There's an aspect of scale, I suppose. With that helper I can achieve greater quantities, but that doesn't change the story in a qualitative way.