Comment by bdhcuidbebe
11 hours ago
> But how do they bypass the paywall?
I’m guessing by using a residential botnet: automating the browsers of unknowing “victims” and reusing their existing credentials.
> Otherwise, the saved page would contain information about the logged-in user.
If you read this article, there’s plenty of evidence that they are manipulating the scraped data.
But I’m just speculating here…
But in the article they talk about manipulating users' devices to do a DDoS, not to scrape websites. And a user going to the archive website is probably not gonna have a subscription. Anyway, I'm not sure that simply visiting archive.today would let it exfiltrate much information from any third-party website, since cookies will not be shared.
I guess if they could control a residential botnet more extensively they would be able to do that, but it would still be very difficult to remove login information from the page. The fact that they manipulated the scraped data a few times, for totally unrelated reasons, proves nothing in my opinion.
They do remove the login information for their own accounts (e.g. the one they use for the LinkedIn sign-up wall). Their implementation is not perfect, though, which is how the aliases were leaked in the first place.