Comment by Animats

14 hours ago

It's time for a lawyer letter. See the Computer Fraud and Abuse Act prosecution guidelines.[1] In general, the US Justice Department will not consider any access to open servers that's not clearly an attack to be "unauthorized access". But,

"However, when authorizers later expressly revoke authorization—for example, through unambiguous written cease and desist communications that defendants receive and understand—the Department will consider defendants from that point onward not to be authorized."

So, you get a lawyer to write an "unambiguous cease and desist" letter. You have it delivered to Amazon by either registered mail or a process server, as recommended by the lawyer. Probably both, plus email.

Then you wait and see if Amazon stops.

If they don't stop, you can file a criminal complaint. That will get Amazon's attention.

[1] https://www.justice.gov/jm/jm-9-48000-computer-fraud

> Then you wait and see if Amazon stops.

That’s if the requests are actually coming from Amazon, which seems very unlikely given some of the details in the post (rotating user agents, residential IPs, seemingly not interpreting robots.txt). The Amazon bot should come from known Amazon IP ranges and respect robots.txt. An Amazon engineer confirmed it in another comment: https://news.ycombinator.com/item?id=42751729

The blog post mentions things like changing user agent strings, ignoring robots.txt, and residential IP blocks. If the only thing that matches Amazon is the “AmazonBot” User Agent string but not the IP ranges or behavior then lighting your money on fire would be just as effective as hiring a lawyer to write a letter to Amazon.

Do we need a "robots must respect robots.txt" law?

  • If we did, bot authors would comply by just changing their User-Agent to something different that’s not expressly forbidden.

    (Disallowing * isn’t usually an option since it makes you disappear from search engines).

Honestly, I figure that being on the front page of Hacker News like this is more than shame enough to get a human from the common sense department to read and respond to the email I sent politely asking them to stop scraping my git server. If I don't get a response by next Tuesday, I'm getting a lawyer to write a formal cease and desist letter.