Comment by wat10000

5 days ago

I've been online since before the web existed, and this is the first time I've ever seen this idea of some implicit obligation to give people advance notice before you deploy a crawler. Looks to me like people are making up new rules on the fly because they don't like Apple and/or LLMs.

I stand by what I said.

Apple are saying you can opt out of their training data collection using robots.txt.

But... they collected their training data before they told people how to opt out.

I don't understand why me pointing that out as "eyebrow raising" is controversial here.
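For readers unfamiliar with the mechanism, here is a minimal sketch of what a robots.txt opt-out looks like from a crawler's side, using Python's standard `urllib.robotparser`. The `Applebot-Extended` user-agent token, the example URL, and the `is_allowed` helper are assumptions for illustration; none of them is taken from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish to opt out.
# "Applebot-Extended" is assumed here as the training opt-out token.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"
    print(is_allowed(ROBOTS_TXT, "Applebot-Extended", page))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A rule like this only affects fetches made after it is published, which is the timing issue under discussion: a site cannot disallow a user-agent token it has not yet been told about.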

  • It's not controversial, it's just not how the ecosystem works. There has never been an expectation that crawler operators give notice of impending crawling.

    It might be nice if there were categories that well-behaved bots could follow, as noted above, but even then the problem would remain for bots doing new things that don't fall into existing categories.

    • My complaint here isn't what they did. It's that they explain it as "here's how to opt out" when the information was too late to allow people to opt out.

      I think that's disingenuous of them.
