Comment by neilv

1 year ago

Can demonstrable ignoring of robots.txt help the cases of copyright infringement lawsuits against the "AI" companies, their partners, and customers?

thayne 1 year ago

Probably not copyright infringement. But it is probably (hopefully?) a violation of the CFAA, both because it is effectively DDoSing you and because they are ignoring robots.txt.

Maybe worth contacting law enforcement?

Although it might not actually be Amazon.

to11mtm 1 year ago
Big thing worth asking here. Depending on what 'amazon' means here (i.e. known to be Amazon specific IPs vs Cloud IPs) it could just be someone running a crawler on AWS.
Or, folks failing their side of AWS's 'shared responsibility model', so their stuff is compromised and botnets are running on it inside AWS.
Or, folks that are quasi-spoofing 'AmazonBot' because they think it will have a better not-block rate than anonymous or other requests...
- thayne 1 year ago
  
  From the information in the post, it sounds like the last one to me. That is, someone else spoofing an Amazonbot user agent. But it could potentially be all three.
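
One way to tell a spoofed user agent from the real crawler is a forward-confirmed reverse DNS check: resolve the requesting IP to a hostname, confirm the hostname falls under a domain the crawler operator publishes, then resolve that hostname back and confirm it returns the same IP. The sketch below is a minimal illustration, not an official method. The function name `verify_crawler_ip`, its parameters, and the `crawl.example.com` suffix are hypothetical; substitute whatever domain the operator actually documents.

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check (illustrative sketch).

    allowed_suffixes: domains the crawler operator documents for its bots.
    The lookup functions are injectable so the logic can be tested offline.
    """
    # Default resolvers use the standard library; gethostbyname_ex is IPv4 only.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: cannot confirm identity.
        return False

    # The hostname must be the documented domain or a subdomain of it.
    if not any(hostname == s or hostname.endswith("." + s) for s in allowed_suffixes):
        return False

    try:
        # The forward lookup must map back to the original IP.
        return ip in forward_lookup(hostname)
    except OSError:
        return False


# Hypothetical usage; the suffix is a placeholder, not a real crawler domain.
# verify_crawler_ip("203.0.113.7", ["crawl.example.com"])
```

A request that claims a well-known bot's user agent but fails this check is consistent with the spoofing scenario, though a pass or fail here says nothing about who is behind traffic that merely originates from cloud IP ranges.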

adastra22 1 year ago

On what legal basis?

flir 1 year ago
In the UK, the Computer Misuse Act applies if:
* There is knowledge that the intended access was unauthorised
* There is an intention to secure access to any program or data held in a computer
I imagine US law has similar definitions of unauthorized access?
`robots.txt` is the universal standard for defining what is unauthorised access for bots. No programmer could argue they aren't aware of this, and ignoring it, for me personally, is enough to show knowledge that the intended access was unauthorised. Is that enough for a court? Not a goddamn clue. Maybe we need to find out.
- pests 1 year ago
  
  > `robots.txt` is the universal standard
  Quite the assumption, you just upset a bunch of alien species.
  
- adastra22 1 year ago
  
  robots.txt isn't a standard. It is a suggestion, and not legally binding AFAIK. In US law at least a bot scraping a site doesn't involve a human being and therefore the TOS do not constitute a contract. According to the Robotstxt organization itself: “There is no law stating that /robots.txt must be obeyed, nor does it constitute a binding contract between site owner and user, but having a /robots.txt can be relevant in legal cases.”
  The last part basically means the robots.txt file can be circumstantial evidence of intent, but there need to be other factors at the heart of the case.
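
For context on what "ignoring robots.txt" means in practice, here is a minimal sketch of the check a well-behaved crawler would run before fetching a URL, using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the `ExampleBot` user agent, and the sample rules are made up for illustration; this shows the voluntary mechanism only and makes no claim about its legal weight.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: block ExampleBot entirely, block /private/ for everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(is_allowed(EXAMPLE_ROBOTS, "ExampleBot", "https://example.com/index.html"))
    print(is_allowed(EXAMPLE_ROBOTS, "OtherBot", "https://example.com/index.html"))
```

Nothing enforces the result: a crawler that skips this check, or runs it and fetches anyway, still gets served unless the site blocks it by other means.
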
readyplayernull 1 year ago
Terms of use contract violation?
- hipadev23 1 year ago
  
  Robots.txt is completely irrelevant. TOU/TOS are also irrelevant unless you restrict access to only those who have agreed to terms.
  
- bdangubic 1 year ago
  
  Good thought, but zero chance this holds up in court.
tepidsaucer 1 year ago

I wind up in jail for ten years if I download an episode of iCarly; Sam Altman inhales every last byte on the internet and gets a ticker tape parade. Make it make sense.