Comment by knowitnone, 3 months ago
You mean AI crawlers from Microsoft, owners of GitHub?

5 comments

  haiku2077, 3 months ago
  The big companies tend to respect robots.txt. The problem is that other, unscrupulous actors use fake user agents and residential IPs, and don't respect robots.txt or act reasonably.

    internetter, 3 months ago
    Big companies have thrown robots.txt to the wind when it comes to their precious AI models.

      sph, 3 months ago
      Yeah, they have openly disregarded copyright law; it's not a puny robots.txt file that's gonna stop them.

  PaulDavisThe1st, 3 months ago
  I have no idea where they are from. I'd be surprised if MS is using a network of 1M+ residential IP addresses, but they've surprised me before ...