Comment by neoromantique

4 hours ago

Ask HN: How does one archive websites like this without being a d-ck?

I want to save this for offline use, but I think a recursive wget is a bit poor manners. Is there an established way one should approach this, like getting it from an archive somehow?

A single user's one-off recursive wget seems fine? Browsers also support saving pages, iirc, individual pages at the very least (and if they're saved to the same place, the links will work).

No doubt it's already on many archive sites, though; you could just fetch from them instead of the original?

  • I ask in a more general sense, whether there is a way to fetch this stuff directly from webarchive or something along those lines.

    Gotta hit the search I feel :)

In the old-web days, I just used wget with slow pacing (and by "pacing" I mean: I don't need it to be done today or even this week, so if it takes a rather long time, that's fine. Going slow also helped keep me from mucking up the latency on my dial-up connection.)
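For the curious, that "slow pacing" approach can be sketched with wget's built-in throttling flags. The URL, wait time, and rate cap below are illustrative assumptions, not values anyone in this thread recommended:

```shell
# Polite recursive mirror: spaced-out requests, capped bandwidth,
# links rewritten for offline browsing. The URL and the specific
# numbers (5s wait, 100 KB/s cap) are placeholders to tune yourself.
wget --recursive --no-parent \
     --wait=5 --random-wait --limit-rate=100k \
     --convert-links --page-requisites --adjust-extension \
     --user-agent="friendly-mirror (contact: you@example.com)" \
     https://example.com/
```

`--wait` plus `--random-wait` keeps the request rate gentle, `--no-parent` stops the crawl from wandering above the starting directory, and `--page-requisites` pulls in the images and CSS each page needs to render offline.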

I don't think that's being a dick for old-web sites that still exist today. Most of the information is text, the photos tend to be small, it's all generally static (i.e., lightweight to serve), and the implicit intent is for people to use it.

But it's pretty slow-moving, so getting it from archive.org would probably suffice if being zero-impact is the goal.
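If zero-impact really is the goal, you can first check whether archive.org already has a capture via its Wayback availability API; the target URL here is just an example:

```shell
# Ask the Wayback Machine whether it has a snapshot of the site.
# "example.com" is a placeholder; substitute the site you care about.
curl -s "https://archive.org/wayback/available?url=example.com"
# If a capture exists, the JSON response includes an
# "archived_snapshots.closest" object with the snapshot's URL and
# timestamp, which you can then fetch instead of hitting the origin.
```

That way the load lands on the Internet Archive's infrastructure, which is built for exactly this, rather than on someone's hobby server.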

(Or, you know: Just email the dude that runs it like it's 1998 again, say hi, and ask. In this particular instance, it's still being maintained.)