Comment by w10-1

1 month ago

> I do not think it is as simple as calling it theft in the same way as copying books

Aside from the incentive problem, there is a kind of theft known as conversion: you were granted a license under some conditions and you went beyond them, e.g. you kept the car past your rental date. In this case, the documentation is for people to read; AI using it to answer questions is a kind of conversion (no, not fair use). But these license limits are mostly implicit in the assumption that (only) people are reading, or buried in unenforceable site terms of use. So it's a squishy kind of stealing after breaching a squishy kind of contract: too fuzzy to stop incentivized parties.

2 comments

jefftk 1 month ago

Why do you think there was an implicit agreement that documentation was only intended for humans? I've written a lot of documentation, much of it open source, and I'm generally very excited that it has proved additionally useful via LLMs. If you had asked me in 2010 whether that was something I intended in writing docs I'm pretty sure I would have said something like "that's science fiction, but sure".

b112 1 month ago

You still intended it for humans. Intent is defined by what one is aiming for, and without knowledge of an alternative, that was your intent.
I 100% get that you are OK with it being ingested by non-humans. And I think many might be OK with that.
One thing: I'm not sure how helpful the documentation is. I think we're getting training out of examples, not docs. This makes me think... we could test this by creating a new pseudo-language and then providing no examples, only docs.
If the LLM can then code effectively after reading the docs, we'd have a successful test. Otherwise? It's all parroting.
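A rough sketch of what such a test harness might look like, in Python. Everything here is made up for illustration: the tiny stack language, its token names (`zib`, `mok`, `tav`), the task list, and the `generate` callable that stands in for whatever LLM you would plug in. The docs string deliberately contains no example programs.

```python
from typing import Callable, Dict, List

# Hypothetical docs for an invented stack language. No example programs on purpose.
DOCS = """\
A program is a sequence of whitespace-separated tokens, evaluated left to right.
An integer token pushes that integer onto the stack.
zib pops two values and pushes their sum.
mok pops two values and pushes their product.
tav pushes a copy of the top value.
The result of a program is the top of the stack when it ends.
"""

def run(program: str) -> int:
    """Reference interpreter for the invented language described in DOCS."""
    stack: List[int] = []
    for tok in program.split():
        if tok == "zib":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "mok":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif tok == "tav":
            stack.append(stack[-1])
        else:
            stack.append(int(tok))  # raises ValueError on unknown tokens
    return stack[-1]  # raises IndexError on an empty stack

# Hypothetical tasks: natural-language prompt -> expected result.
TASKS: Dict[str, int] = {
    "compute 2 plus 3": 5,
    "compute the square of 7": 49,
    "compute (1 + 2) times 4": 12,
}

def score(generate: Callable[[str, str], str]) -> float:
    """Fraction of tasks where the generated program yields the expected value.

    `generate(docs, task)` is a stand-in for the model under test; it sees
    only the docs and the task text and returns a program as a string.
    """
    passed = 0
    for task, expected in TASKS.items():
        try:
            if run(generate(DOCS, task)) == expected:
                passed += 1
        except (IndexError, ValueError):
            pass  # malformed programs count as failures
    return passed / len(TASKS)
```

You would call `score(my_model)` with a wrapper around the model. One caveat with this particular sketch: a stack language resembles existing ones, so a pass here would be weaker evidence than a pass on something stranger.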