Comment by ChocMontePy

10 hours ago

I noticed last year that some archived pages are getting altered.

Every Reddit archived page used to have a Reddit username in the top right, but then it disappeared. "Fair enough," I thought. "They want to hide their Reddit username now."

The problem is, they did it retroactively too, removing the username from past captures.

You can see this on old Reddit captures: the normal archived page has no username, but when you switch to the Screenshot tab of the archive, it is still there. The screenshot is the original capture; the username has since been removed from the normal webpage version.

When I noticed it, it seemed like such a minor change, but with these latest revelations, it doesn't seem so minor anymore.

> When I noticed it, it seemed like such a minor change, but with these latest revelations, it doesn't seem so minor anymore.

That doesn't seem nefarious, though. It makes sense they wouldn't want to reveal whatever accounts they use to bypass blocks, and the logged-in account isn't really meaningful content to an archive consumer.

Now, if they were changing the content of a Reddit post or comment, that would be an entirely different matter.

  • Editing what is billed as an archive defeats the purpose of an "archive".

    • > Editing what is billed as an archive defeats the purpose of an "archive".

      No, certain edits are understandable and required. Even archive.org edits its pages (e.g. sticks banners on them and does a bunch of stuff to make them work like you'd expect).

      Even paper archives edit documents (e.g. writing sequence numbers on them, so the ordering doesn't get lost).

      Disclosing exactly what account was used to download a particular page is arguably irrelevant information, and may even compromise the work of archiving pages (e.g. if it just opens the account to getting blocked).

    • Don't be surprised by this; there are a lot more edits than you think. For example, CSS is always inlined so that pages render the same as when they were archived.

    • The relevant part of the page to archive is the content of the page, not the user account that visited the page. Most sane people would consider two archives of the same page with different user accounts at the top to be the same page.