Comment by LordDragonfang

9 hours ago

> You will almost certainly find large parts of it, verbatim, inside a GitHub repository or on an author's webpage. AI takes the credit so you don't get blamed for copyright theft.

Only if you're doing something trivial or highly common, in which case it's boilerplate that shouldn't be copyrighted. We already had this argument when Oracle sued Google over Java. We already had the "just stochastic parrots" conversation too, and concluded it's a specious argument.

2 comments

heavyset_go 9 hours ago

> We already had this argument when Oracle sued Google over Java.

"It's boilerplate therefore it isn't IP" isn't the argument that was made by Google, nor is it the argument that the case was decided upon.

It was decided that Google's use of the API was fair use under the four factors courts weigh to determine whether a use of IP is fair. The Court found that, even assuming the API was Oracle's copyrighted IP, it was still fair use to use it in the way Google did.

https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_...

themafia 9 hours ago

> in which case it's boilerplate that shouldn't be copyrighted

Let's say it's boilerplate code filled with comments that are designed to assist in understanding the API being written against. Are the comments somehow not covered because they were added to "boilerplate code"? Even if they're reproduced verbatim as well?

> We already had the "just stochastic parrots" conversation too

Oh, I was not part of those conversations. Perhaps you can link me to them? The mere assertion that they happened is underwhelming and entirely unconvincing, particularly when it seems easy to ask an LLM to generate code and then search the Internet for elements of that code. With that methodology you wouldn't need to rely on conversations; you'd have actual hard data. Do you happen to know if such data is also available?
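
For illustration only, the methodology described above (generate code, then look for its elements elsewhere) could be approximated offline along these lines. This is a minimal sketch, not anything proposed in the thread: it assumes you already have a local list of reference source files to compare against instead of a web search, and the names `shingles`, `verbatim_overlap`, and the window size `k` are hypothetical choices.

```python
from __future__ import annotations

import re


def shingles(source: str, k: int = 8) -> set[tuple[str, ...]]:
    """Return the set of k-token windows in `source`.

    Tokens are identifiers/numbers or single punctuation characters,
    so differences in whitespace and indentation are ignored.
    """
    tokens = re.findall(r"\w+|[^\w\s]", source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def verbatim_overlap(generated: str, corpus: list[str], k: int = 8) -> float:
    """Fraction of the generated code's k-token windows that appear
    verbatim in at least one reference document (assumed local corpus)."""
    gen = shingles(generated, k)
    if not gen:
        # Too short to form even one window; report no measurable overlap.
        return 0.0
    known: set[tuple[str, ...]] = set()
    for doc in corpus:
        known |= shingles(doc, k)
    return len(gen & known) / len(gen)
```

A score near 1.0 would mean most of the generated snippet's token windows exist verbatim in the reference set; a score near 0.0 would mean almost none do. The result depends heavily on `k` and on what the corpus contains, so it is a rough signal and not evidence of copying on its own.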