Comment by grajaganDev
2 days ago
This keeps generating new pages to keep the crawler occupied.
Looks like this would tarpit any web crawler.
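To make "keeps generating new pages" concrete, here is a minimal sketch of the general idea, not the actual tool's implementation. Everything in it is an assumption for illustration: the names `build_chain` and `render_page`, the seed text, and the choice to seed the random generator from the URL path so that a given URL always returns the same page.

```python
import hashlib
import random
from collections import defaultdict

# Hypothetical seed text; a real tarpit would train on a much larger corpus.
SEED_TEXT = (
    "the crawler follows every link and every link "
    "leads the crawler to another page"
)

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def render_page(path, chain, n_words=30, n_links=3):
    """Return (babble, links) for a URL path.

    The RNG is seeded from the path (an assumption of this sketch), so the
    same URL always yields the same text and the same outgoing links.
    """
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    starts = sorted(chain)
    word = rng.choice(starts)
    words = [word]
    for _ in range(n_words - 1):
        followers = chain.get(word)
        # Restart from a random word when the current one has no successor.
        word = rng.choice(followers) if followers else rng.choice(starts)
        words.append(word)
    # Every page links to fresh child paths, so the site never runs out of pages.
    base = path.rstrip("/")
    links = [f"{base}/{rng.getrandbits(32):08x}" for _ in range(n_links)]
    return " ".join(words), links

chain = build_chain(SEED_TEXT)
text, links = render_page("/maze", chain)
```

Each link in `links` can be passed back to `render_page`, which is what keeps a crawler that follows every link busy indefinitely.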
It would indeed. Note the warning: "There is not currently a way to differentiate between web crawlers that are indexing sites for search purposes, vs crawlers that are training AI models. ANY SITE THIS SOFTWARE IS APPLIED TO WILL LIKELY DISAPPEAR FROM ALL SEARCH RESULTS."
Real search engines respect robots.txt so you could just tell them not to enter Markov Chain Hell.
I suspect AI crawlers would also (quickly learn to) respect it?
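For reference, this is roughly what "respecting robots.txt" looks like from the crawler's side, sketched with Python's standard `urllib.robotparser`. The `/maze/` path, the `ExampleBot` user agent, and the `may_fetch` helper are hypothetical; the source does not say where the tarpit would be mounted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that fences off the tarpit under /maze/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /maze/
"""

def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A well-behaved crawler checks before every request.
ok_home = may_fetch("ExampleBot", "https://example.com/index.html")
ok_maze = may_fetch("ExampleBot", "https://example.com/maze/page1")
```

A crawler that skips this check, or ignores its result, is the one that ends up in the maze.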
It's actually a great way to spread malware without leaving traces, too: it makes content inspection very difficult and breaks view-source:, most debugging tools, saving to .har, etc.
How is view-source broken?