Comment by skybrian

2 months ago

As far as I know, Google respects robots.txt and doesn't obfuscate their crawlers, so you can easily block them if you want. It seems like an important distinction?

Google can afford to respect robots.txt because it has a monopoly on search and nobody would consider actually blocking them in said robots.txt anyway.

SerpApi doesn't have that privilege.

There's no law that says you have to obey robots.txt. It used to be a sensible thing to do in the early internet. In the current internet, obeying robots.txt is a self-imposed handicap and you shouldn't do it.

DDoS remains illegal regardless of robots.txt.

  • It's rather odd to use words like "should" when you're advocating for disrespecting other people's wishes. There are sometimes reasons not to cooperate, but cooperating seems like a good default.

    • The web is now hostile. If you're starting a search engine, everyone else has written a robots.txt that bans you from starting a search engine. You either ignore that, or you abandon your plan to make a search engine.
