Comment by Nextgrid

2 months ago

> SerpApi deceptively takes content that Google licenses from others

They have a different definition of "licensing" than most people, I guess. Aren't site operators complaining about Google using this "licensed" content in AI Overviews, not to mention the scraping for AI model training?

The pot is calling the kettle black.



skybrian 2 months ago

As far as I know, Google respects robots.txt and doesn't obfuscate their crawlers, so you can easily block them if you want. It seems like an important distinction?

Nextgrid 2 months ago
Google can afford to respect robots.txt because it has a monopoly on search and nobody would consider actually blocking them in said robots.txt anyway.
SerpApi doesn't have that privilege.
- skybrian 2 months ago
  
  Some domains do block Google, often partially. There are some statistics here:
  https://radar.cloudflare.com/ai-insights#ai-user-agents-foun...
- xnx 2 months ago
  
  Google has respected robots.txt from the start.
- bitpush 2 months ago
  
  But SerpApi is not scraping websites; it is sending malicious requests to google.com.
  
throw-12-16 2 months ago

robots.txt is not a legally binding document; nobody actually needs to respect it.
immibis 2 months ago
There's no law that says you have to do that. It used to be a sensible thing to do, in the early internet. In the current internet, obeying robots.txt is a self-handicap and you shouldn't do it.
DDoS remains illegal regardless of robots.txt.
- skybrian 2 months ago
  
  It's rather odd to use words like "should" when you're advocating for disrespecting other people's wishes. There are sometimes reasons not to cooperate, but it seems like a good default.
  