Comment by coldpie
10 hours ago
> allowing Googlebot to crawl my sites
As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.
> They have no obligation to respect your wishes
I have no obligation to not send all scraper-looking traffic to a black hole full of zip bombs.
There's always poison fountain - deliberately wrong source code.
You do have an obligation because what you are describing is illegal, at least in the US under the CFAA.
Okay, nix the zip bombs. What's my obligation to treat bot-shaped traffic as something I should reply to?
Spreading malware to your website's visitors is wild and illegal in most jurisdictions. I certainly wouldn't confess about it online.
Malware? It's just a large file. A very, very large file.
But fine. How about I just...don't respond to those requests at all. I have no obligation to send them data period.
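For what it's worth, "just don't respond" could look roughly like the sketch below. It assumes you decide based on the User-Agent header alone, which is easy to spoof; `BOT_MARKERS`, `looks_like_bot`, and `handle` are made-up names for illustration, not anyone's real setup.

```python
from typing import Optional

# Hypothetical marker substrings. Real crawler User-Agent strings vary
# and can be spoofed, so treat this as a heuristic only.
BOT_MARKERS = ("bot", "crawler", "spider", "scrapy")


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains any of the marker substrings."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def handle(user_agent: str) -> Optional[str]:
    """Return a response body, or None to signal 'send nothing back'."""
    if looks_like_bot(user_agent):
        return None  # the caller would close the connection without replying
    return "hello"
```

A server would call `handle()` per request and drop the connection whenever it returns `None`.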
Is AI a visitor or malware? It certainly steals paid resources (bandwidth).
Disclaimer: his website is for hosting malware for "testing" purposes. Testing how well AI can't deal with it.
Except Google does respect robots.txt, so you do have a choice?
still respects robots.txt
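To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser` to check what a given rule set tells a well-behaved crawler. The rules and the `example.com` URL are invented for illustration, and robots.txt is only advisory: it works only if the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Googlebot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a compliant crawler would conclude for each user agent.
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/post")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/post")

print(googlebot_allowed, other_allowed)
```

This only shows what the file asks for; a scraper that ignores robots.txt never runs this check.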