
Comment by 1vuio0pswjnm7

3 months ago

https://web.archive.org/web/20260220191245if_/https://arstec...

archive.today is very popular on HN; the opaque, shortened URLs are promoted on HN every day

I can't use archive.today. I tried but gave up. Too many hassles. I might be in the minority but I know I'm not the only one. As it happens, I have not found any site that I cannot access without it

The most important issue with archive.today though is the person running it, their past and present behaviour. It speaks for itself

Whoever it is, they have a lot of info about HN users' reading habits given that archive.today URLs are so heavily promoted by HN submitters, commenters and moderators

Archive.today wants/needs EDNS subnet

"Geolocation" as a justification is ambiguous

Why is there a need for geolocation?

Geolocation can be used for multiple purposes

"DNS performance" is only one purpose

Other purposes might offer the user no benefit, and might even be undesirable for users

As a result, some users don't send EDNS subnet. It's always been optional to send it

Even public resolvers, third party DNS services, like Cloudflare, recognise the tradeoffs for users and allow users to avoid sending it. Popular DNS software makes compiling support for EDNS subnet optional
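For readers unfamiliar with what "sending EDNS subnet" means on the wire, here is a minimal sketch, assuming the option layout from RFC 7871 (option code 8, then family, source prefix length, scope prefix length, and the address truncated to the prefix). The function name `build_ecs_option` is hypothetical and not taken from any DNS library. It shows that the option carries part of the client's network address, and that a source prefix length of 0 carries no address bits at all.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS Client Subnet option code, per RFC 7871


def build_ecs_option(subnet: str) -> bytes:
    """Encode an EDNS Client Subnet option for the given CIDR subnet.

    Hypothetical helper for illustration only; real resolvers and DNS
    libraries build this internally.
    """
    net = ipaddress.ip_network(subnet, strict=False)
    family = 1 if net.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    source_prefix_len = net.prefixlen
    scope_prefix_len = 0  # always 0 in queries
    # Only the bytes covered by the prefix are sent, not the full address.
    addr_len = (source_prefix_len + 7) // 8
    address = net.network_address.packed[:addr_len]
    payload = struct.pack("!HBB", family, source_prefix_len, scope_prefix_len) + address
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


if __name__ == "__main__":
    # A /24 reveals the first three octets of the client's address.
    print(build_ecs_option("192.0.2.57/24").hex())
    # A /0 reveals nothing; RFC 7871 treats it as a request not to add client data.
    print(build_ecs_option("0.0.0.0/0").hex())
```

The /24 case is the commonly cited truncation for IPv4. The /0 case is how a client can ask a recursive resolver not to forward its address information.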

Archive.today wants/needs EDNS subnet so badly that it tries to gather it using a tracking pixel, or it tries to block users who don't send it, e.g., Cloudflare users

Thus, before one even considers all the other behaviour of this website operator, some of which is mentioned in this thread, there is a huge red flag for anyone who pays attention to EDNS subnet

As with almost all websites, repeated DNS lookups are not an absolute requirement for successful HTTP requests

There are some IP addresses for archive.{today,is,md,ph,li,...} that have continued to work for years
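The point about skipping repeated DNS lookups can be sketched as follows: connect to a previously known IP address while still presenting the hostname for TLS SNI and the HTTP Host header. This is only an illustration using the Python standard library. The function names are hypothetical, and the address in the usage note is a documentation placeholder (192.0.2.0/24), not a real archive.today address.

```python
import socket
import ssl


def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request that names the site in Host."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_via_pinned_ip(host: str, ip: str, path: str = "/", timeout: float = 10.0) -> bytes:
    """Fetch a page over TLS from a fixed IP, without a DNS lookup.

    The hostname is used only for SNI, certificate checking, and the
    Host header; the TCP connection goes straight to `ip`.
    """
    context = ssl.create_default_context()
    with socket.create_connection((ip, 443), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_http_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

A call would look like `fetch_via_pinned_ip("archive.today", "192.0.2.10")`, with the placeholder replaced by an address the user has recorded themselves. An entry in the system hosts file achieves the same effect without any code.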

I use archive.today all the time. How do you access pages, for instance on The Economist, without it?

  • For me, all archive.* links just present an endless captcha loop. I am not using CF DNS or any proxy/VPN, but even if I do try those things, it still doesn't work.

  •    http-request set-header user-agent "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.6533.103 Mobile Safari/537.36 Lamarr" if { hdr(host) -m end economist.com }
    

    Years ago I used some other workaround that no longer works, maybe something like amp.economist.com. AMP with text-only browser was a useful workaround for many sites

    Workarounds usually don't last forever. Websites change from time to time. This one will stop working at some point

    There are some people who for various reasons cannot use archive.today
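For readers who do not run a local proxy, the header rewrite quoted in this comment can be approximated in a script. The sketch below assumes the proxy rule is read as "replace the User-Agent when the Host ends with economist.com"; the function name `build_request` is hypothetical. It only constructs the request and sends nothing, and as the comment says, the workaround may stop working at any time.

```python
from urllib.parse import urlsplit
from urllib.request import Request

# User-Agent string copied from the proxy rule quoted in the comment.
LAMARR_UA = (
    "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/127.0.6533.103 Mobile Safari/537.36 Lamarr"
)


def build_request(url: str) -> Request:
    """Return a Request whose User-Agent is replaced for economist.com hosts.

    Mirrors the suffix match in the proxy rule (`-m end economist.com`).
    """
    req = Request(url)
    host = (urlsplit(url).hostname or "").lower()
    if host.endswith("economist.com"):
        req.add_header("User-Agent", LAMARR_UA)
    return req
```

Passing the result to `urllib.request.urlopen` would perform the fetch; other hosts are left with the library's default headers.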

  • If dang and tomhow enforced a policy against paywalled content, there would be less interest in accessing those pages via third parties. Most news gets reported by multiple outlets in general, so the same discussions would still surface.

> Whoever it is, they have a lot of info about HN users' reading habits given that archive.today URLs are so heavily promoted by HN submitters, commenters and moderators

Anyone interested in the reading habits of HN users can just take a look at news.ycombinator.com ;)

> Whoever it is, they have a lot of info about HN users' reading habits given that archive.today URLs are so heavily promoted by HN submitters, commenters and moderators

It's not promoted, it's just used as a paywall bypass so everyone can read the linked article.

You can change the TLD of any archive.today link if .today doesn't work, for example to archive.ph, archive.is, archive.md, etc.

  • There's a DNS issue between Archive Today and some ISPs which causes their domains not to resolve properly, which is why some people have a lot of trouble using it.

The fact is I can't have a discussion about a paywalled article without reading it. Archive.today is popular as a paywall bypass because nobody wants HN to devolve into debate based on a headline where nobody has RTFA.

"archive.today" as used here means the collection of archive.tld domains, where .tld could be ".is", ".md", ".ph", etc.

"promoted" as used here means placing an archive.tld URL at the top of an HN thread so that many HN readers will follow it, or placing these URLs elsewhere in threads

>I can't use archive.today. I tried but gave up. Too many hassles.

What hassles have you experienced?

I use the Archive Page[0] extension which is really easy to use.

The only thing that annoys me about it is the repeated requests (starting about eight or nine months ago) to complete CAPTCHAs.

[0] https://addons.mozilla.org/en-US/firefox/addon/archive-page/

  • "The only thing that annoys me about it is the repeated requests (starting about eight or nine months ago) to complete CAPTCHAs"

    Why does this annoy you

    • >Why does this annoy you

      Prior to that I was rarely prompted with a CAPTCHA. Now it's every. single. time. I archive something or open an AT link.

      Why doesn't that annoy you?
