Comment by Animats
14 hours ago
It's time for a lawyer letter. See the Computer Fraud and Abuse Act prosecution guidelines.[1] In general, the US Justice Department will not consider any access to open servers that's not clearly an attack to be "unauthorized access". But the guidelines add:
"However, when authorizers later expressly revoke authorization—for example, through unambiguous written cease and desist communications that defendants receive and understand—the Department will consider defendants from that point onward not to be authorized."
So, you get a lawyer to write an "unambiguous cease and desist" letter. You have it delivered to Amazon by either registered mail or a process server, as recommended by the lawyer. Probably both, plus email.
Then you wait and see if Amazon stops.
If they don't stop, you can file a criminal complaint. That will get Amazon's attention.
> Then you wait and see if Amazon stops.
That’s if the requests are actually coming from Amazon, which seems very unlikely given some of the details in the post (rotating user agents, residential IPs, seemingly not interpreting robots.txt). The Amazon bot should come from known Amazon IP ranges and respect robots.txt. An Amazon engineer confirmed it in another comment: https://news.ycombinator.com/item?id=42751729
The blog post mentions things like changing user agent strings, ignoring robots.txt, and residential IP blocks. If the only thing that matches Amazon is the “AmazonBot” User Agent string, and not the IP ranges or behavior, then lighting your money on fire would be just as effective as hiring a lawyer to write a letter to Amazon.
I wonder how the author hasn't reached this conclusion. The official Amazon Crawler docs literally tell you how to distinguish between legit Amazonbots and malicious copycats via DNS lookup: https://developer.amazon.com/amazonbot
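For anyone who wants to script that DNS check, here is a rough sketch of forward-confirmed reverse DNS in Python. It reverse-resolves the client IP, checks the hostname suffix, then resolves the hostname forward and checks that it maps back to the same IP. The suffix `.crawl.amazonbot.amazon` is an assumption on my part, so confirm it against the Amazonbot page linked above. The function name and the injectable lookup parameters are made up for illustration.

```python
import socket

# ASSUMPTION: hostname suffix for genuine Amazonbot crawlers.
# Confirm it against the official Amazonbot docs before relying on it.
AMAZONBOT_SUFFIX = ".crawl.amazonbot.amazon"


def is_verified_crawler(ip, suffix=AMAZONBOT_SUFFIX,
                        reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    The lookup functions are injectable so the logic can be exercised
    without touching real DNS.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses).
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses);
        # it is IPv4 only, so IPv6 clients need getaddrinfo instead.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # Step 1: the PTR record must sit under the expected domain.
    if not hostname.lower().rstrip(".").endswith(suffix):
        return False

    # Step 2: the hostname must resolve back to the same IP. Anyone can
    # set a PTR record for an IP they control, so this step is what
    # stops a spoofed reverse entry.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

A request that claims to be Amazonbot in its User-Agent but fails this check is, on this reasoning, a copycat and can be blocked without worrying about the real crawler.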
Do we need a "robots must respect robots.txt" law?
If we did, bot authors would comply by just changing their User-Agent to something different that’s not expressly forbidden.
(Disallowing * isn’t usually an option since it makes you disappear from search engines).
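To make that trade-off concrete, here is a small sketch using Python's standard `urllib.robotparser`. The two robots.txt bodies and the bot name `TotallyNewBot` are invented for illustration. It shows that a rule naming one bot does nothing against a renamed bot, while a wildcard rule also shuts out search engine crawlers.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, path="/"):
    """Return whether robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical robots.txt that names one bot explicitly.
BLOCK_ONE = """\
User-agent: Amazonbot
Disallow: /
"""

# Hypothetical robots.txt that blocks every compliant crawler.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    for name, body in (("BLOCK_ONE", BLOCK_ONE), ("BLOCK_ALL", BLOCK_ALL)):
        for agent in ("Amazonbot", "TotallyNewBot", "Googlebot"):
            print(name, agent, allowed(body, agent, "/repo.git"))
```

Under the first file only the named bot is refused, so a crawler that changes its name is still technically "respecting" robots.txt. Under the second file everything is refused, including the search engines you probably want.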
Honestly, I figure that being on the front page of Hacker News like this is more than shame enough to get a human from the common sense department to read and respond to the email I sent politely asking them to stop scraping my git server. If I don't get a response by next Tuesday, I'm getting a lawyer to write a formal cease and desist letter.
Someone from Amazon already responded: https://news.ycombinator.com/item?id=42751729
> If I don't get a response by next Tuesday, I'm getting a lawyer to write a formal cease and desist letter.
Given the details, I wouldn’t waste your money on lawyers unless you have some information other than the user agent string.
It's computer science: nothing changes on the corpo side until they get a lawyer letter.
And even then, it's probably not going to be easy
No one gives a fuck in this industry until someone turns up with bigger lawyers. This is behaviour which is written off with no ethical concerns as ok until that bigger fish comes along.
Really puts me off it.
Lol you really think an ephemeral HN ranking will make change?
It's not unheard of. But neither would I count on it.
It did yesterday!
https://news.ycombinator.com/item?id=42740516
There's only one way to find out!