Comment by dhx

5 hours ago

Nothing ready-to-go that I'm aware of. ATP will just observe in the next weekly crawl that a shop is no longer returned by the storefinder API call or sitemap crawl, and that shop will simply not be present in the next weekly dataset generated.

To set up archives of shop-specific pages (e.g. a record of opening hours, address, etc. at a point in time), one could monitor https://alltheplaces.xyz/builds.html for the latest builds and, when a new build completes, compare it against the previous (second-newest) build. Then for any feature whose attributes have changed (address, phone number, opening hours, etc.), archive the `website` and/or `source_uri` attribute pages again to ensure the latest snapshot is captured. Any new feature would get the same treatment, so the page for the newly observed shop/feature is archived for the first time.
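A rough sketch of that comparison step, assuming each build's output for a spider has already been downloaded and parsed into a list of GeoJSON-style feature dicts. The property keys used here (`ref` as a stable per-feature identifier, `addr:full`, `phone`, `opening_hours`) and the helper names are assumptions for illustration, so check them against the actual build output before relying on this. It only collects the URLs that would need archiving; submitting them to an archive is left out.

```python
# Sketch only: property keys below are assumed, not confirmed against ATP output.
WATCHED_KEYS = ("addr:full", "phone", "opening_hours")  # attributes to watch (assumed)
URL_KEYS = ("website", "source_uri")  # attributes named in the comment above


def index_by_ref(features):
    """Map each feature's `ref` (assumed stable identifier) to its properties."""
    indexed = {}
    for feature in features:
        props = feature.get("properties", {})
        if "ref" in props:
            indexed[props["ref"]] = props
    return indexed


def urls_to_archive(previous_features, new_features, watched=WATCHED_KEYS):
    """Return sorted URLs for features that are new or whose watched attributes changed."""
    previous = index_by_ref(previous_features)
    current = index_by_ref(new_features)
    urls = set()
    for ref, props in current.items():
        before = previous.get(ref)
        is_new = before is None
        is_changed = not is_new and any(
            before.get(key) != props.get(key) for key in watched
        )
        if is_new or is_changed:
            urls.update(props[key] for key in URL_KEYS if props.get(key))
    return sorted(urls)
```

Since `ref` values are presumably only unique within one spider, this would be run once per spider output file rather than over a whole build at once.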

I'm also aware ArchiveTeam projects tend to commence once the impending collapse of a retail chain is known and someone realises there is a website not yet archived which would be useful to preserve. Monitoring ATP feature counts for brands across time may give some hint of how a brand is performing and whether it is growing or shrinking, without having to find press releases and financial statements of the brand. Even if a brand suddenly announces bankruptcy (it happens all the time), generally the website will remain online for at least a few months whilst a new buyer is sought or whilst each retail location has a fire sale to get rid of remaining merchandise. It's also worthwhile to be aware of acquisitions of retail chains, as this often results in the new parent company changing websites soon after the acquisition closes, possibly removing useful content that once existed. Websites also change "just because", and this could be observed after the fact by seeing when ATP spiders break and get replaced/fixed.
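The feature-count idea could look something like the following, again assuming features are already parsed into GeoJSON-style dicts with a `brand` property (an assumed key). The function names and the 20% threshold are made up for illustration. A sudden drop can also just mean a spider broke, so a flagged brand is a prompt to go and look, not a conclusion.

```python
from collections import Counter


def brand_counts(features):
    """Count features per brand, skipping features with no `brand` property (assumed key)."""
    counts = Counter()
    for feature in features:
        brand = feature.get("properties", {}).get("brand")
        if brand:
            counts[brand] += 1
    return counts


def shrinking_brands(previous_counts, current_counts, threshold=0.2):
    """Return {brand: fractional drop} for brands whose count fell by at least `threshold`.

    The default threshold is an arbitrary illustrative value.
    """
    flagged = {}
    for brand, before in previous_counts.items():
        after = current_counts.get(brand, 0)
        drop = (before - after) / before
        if drop >= threshold:
            flagged[brand] = drop
    return flagged
```

Comparing against a build from several weeks back instead of only the previous one would probably smooth out week-to-week noise.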
