Comment by Aurornis

16 hours ago

Yes, but the point is that big company crawlers aren’t paying for questionably sourced residential proxies.

If this person is seeing a lot of traffic from residential IPs then I would be shocked if it’s really Amazon. I think someone else is doing something sketchy and they put “AmazonBot” in the user agent to make victims think it’s Amazon.

You can set the user agent string to anything you want, as we all know.
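Since the user agent proves nothing, the usual way to check a crawler claim is forward-confirmed reverse DNS: resolve the IP to a hostname, check that the hostname sits under a domain the operator publishes, then resolve that hostname back and confirm it returns the same IP. Here is a rough sketch of that idea, not anyone's official method. The function name `is_verified_crawler`, its parameters, and the `crawl.example.net` suffix are all made up for illustration; you would substitute whatever domain the crawler's operator actually documents.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IP that claims to be a crawler.

    allowed_suffixes is a list of domains the operator publishes for its
    crawler hosts (placeholder values here, not real ones).
    The lookup functions can be swapped out, e.g. for testing.
    """
    if reverse_lookup is None:
        # IP -> primary hostname (PTR record)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # hostname -> list of IPv4 addresses (this default ignores IPv6)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: cannot verify the claim
        return False

    # The hostname must be a subdomain of one of the published domains
    if not any(hostname.endswith("." + s.strip(".").lower()) for s in allowed_suffixes):
        return False

    try:
        # The forward lookup must lead back to the original IP
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

A residential proxy IP will normally reverse-resolve to an ISP hostname (or nothing), so it fails the suffix check no matter what the user agent says. Usage would look something like `is_verified_crawler(request_ip, ["crawl.example.net"])`.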

I used to work in malware detection at a security company, and we looked at residential IP proxy services.

They are very, very, very expensive for the amount of data you get. You are paying per bit of data. Even with Amazon's money, the numbers quickly become untenable.

It was literally cheaper for us to subscribe to business ADSL/cable/fiber optic services at our corp office buildings and trunk them together.

I wonder if anyone has checked whether Alexa devices serve as a private proxy network for AmazonBot’s use.

  • Yes, people have probably analyzed Alexa traffic once or twice over the years.

    • You joke, but do people analyze it continuously forever also? Because if we’re being paranoid, that’s something you’d need to do in order to account for random updates that are probably happening all the time.

I worked for Microsoft doing malware detection 10+ years ago, and questionably sourced proxies were well and truly on the table.

  • >> but the point is that big company crawlers aren’t paying for questionably sourced residential proxies.

    > I worked for Microsoft doing malware detection back 10+ years ago, and questionably sourced proxies were well and truly on the table

    Big Company Crawlers using questionably sourced proxies - this seems striking. What can you share about it?

    • They worked on malware detection. The most likely reason is very obvious: if you only allow traffic from residential addresses to your Command & Control server, you make anti-malware research (which is most likely coming from either a datacenter or an office building) an awful lot harder - especially when you give non-residential IPs a different and harmless response instead of straight-up blocking them.

They could be using Echo devices to proxy their traffic…

Although I’m not necessarily gonna make that accusation, because it would be pretty serious misconduct if it were true.

  • To add: it’s also kinda silly on the surface of it for Amazon to use consumer devices to hide their crawling traffic, but still leave “Amazonbot” in their UA string… it’s pretty safe to assume they’re not doing this.

> Yes, but the point is that big company crawlers aren’t paying for questionably sourced residential proxies

You'd be surprised...

  • >> Yes, but the point is that big company crawlers aren’t paying for questionably sourced residential proxies

    > You'd be surprised...

    Surprised by what? What do you know?