Comment by bayindirh
19 hours ago
Currently all AI companies argue that the content they use falls under fair use, and disregard all licenses. This means any future ones respecting these licenses needs to be whitelisted.
19 hours ago
Currently all AI companies argue that the content they use falls under fair use, and disregard all licenses. This means any future ones respecting these licenses needs to be whitelisted.
How do you know that that bot is part of those AI companies? Maybe it's my personal bot you're blocking, should I also not have (indirectly) access to the content?
No. Access to my content is a privilege I grant you. I decide how you get to access it, and via a bot that my setup confuses for an AI crawler belonging to an anti-human AI corporation is not a valid way to access it. Get off my virtual lawn.
> No. Access to my content is a privilege I grant you.
Right, I thought the conversation was about public websites on the public internet, but I think you're talking about this in the context of a private website now? I understand keeping tighter controls if you're dealing with private content you want accessible via the internet for others but not the public.
5 replies →
You can use a honest user string denoting that it's your bot. Some AI companies label their bots transparently, they show up on the logs I keep.
While I understand that you may need a personal bot to crawl or mirror a site, I can't guarantee that I'll grant you access.
I don't like to be that heavy-handed in the first place, but capitalism is making it harder to trust entities which you can't see and talk face to face.