Comment by knowitnone

3 months ago

You mean AI crawlers from Microsoft, owners of GitHub?

The big companies tend to respect robots.txt. The problem is the other, unscrupulous actors who use fake user agents and residential IPs and neither respect robots.txt nor act reasonably.

I have no idea where they are from. I'd be surprised if MS is using a network of 1M+ residential IP addresses, but they've surprised me before ...
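
To make the robots.txt point concrete, here is a rough sketch of what "respecting robots.txt" looks like on the crawler side, using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleAIBot` name, the `allowed` helper, and the example.com URLs are all made up for illustration. The check is voluntary and keyed on whatever user agent the client claims, so a crawler that lies about its user agent (or never runs the check) is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only (not from any real site).
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler identifies itself honestly and is told to stay out.
    print(allowed("ExampleAIBot", "https://example.com/repo"))
    # The same request under a browser-like user agent passes the check,
    # which is why the rules only bind clients that choose to follow them.
    print(allowed("Mozilla/5.0", "https://example.com/repo"))
```

Nothing here is enforced by the server; blocking clients that skip or spoof this step needs some other mechanism.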