Comment by mojosam
10 hours ago
> Someone with a subscription logs into the site, then archives it.
That’s not the case. I don’t have a NYT subscription. I just Googled for an old, obscure article from 1989 on pork bellies that I thought archive.today would be unlikely to have cached. Sure enough, when I asked it to retrieve that article, it didn’t have it and began the caching process. A few minutes later it came up with the webpage, and if you visit it on archive.is, you can see it was first cached just a few minutes ago.
https://www.nytimes.com/1989/11/01/business/futures-options-...
My assumption has been that the NYT is letting them around the paywall, much like the unrelated Wayback Machine. How else could this be working? The only other ways I can think of are that either they have access to a NYT account and are caching with it, which I suspect the NYT would notice and shut down, or there is a documented hole in the paywall that they are exploiting (though not by way of the Wayback Machine, since the caching process shows they are pulling directly from the NYT).