
Comment by wvenable

6 hours ago

> You mean the knowledge that Claude has stolen from all of us and regurgitated into your projects without any copyright attributions?

You can't, and shouldn't be able to, copyright and hoard "knowledge".

I did not suggest that; however, the law is clear. If I use my knowledge to produce code under a specific license, and you then take that code and reproduce it without the license, you have broken the law.

You can twist this around as much as you like, but there are several studies showing that LLMs can and will happily reproduce content from their training data.

  • > If I use my knowledge to produce code under a specific license, and you then take that code and reproduce it without the license, you have broken the law.

    Correct. But if I read your code, produce a detailed specification of it, and then give that specification to another team (one that has never seen your code) and they create a similar product then they haven't broken the law.

    LLMs reproducing exact content from their training data is a symptom of overfitting and is an error that needs correcting. Memorizing specific training data means the model is not generalizing enough.

    • > and they create a similar product then they haven't broken the law.

      That costs significantly more and involves the creation of jobs. I see this as a great outcome. There seems to be a group of people who share the opposite of my views on this matter.

      > and is an error that needs correcting

      It's been known for years. They don't seem interested in fixing it, or they simply aren't capable. I presume that's because most of the value in their service _is_ the copyright whitewashing.

      > Memorizing specific training data means that it is not generalizing enough.

      Is that like a knob they can turn or is it something much more fundamental to the technology they've staked trillions on?