Comment by bbor

6 months ago

Random unprompted fun fact: Articles are the main type of "Page" on Wikipedia, but not the only type! Buried deep in their docs is the full list of 'namespaces', which you need in order to parse their XML dumps:

  from enum import IntEnum

  class Namespace(IntEnum):
      # Negative IDs are virtual namespaces; odd IDs are talk pages.
      MEDIA = -2
      SPECIAL = -1
      ARTICLE = 0
      TALK = 1
      USER = 2
      USER_TALK = 3
      WIKIPEDIA = 4
      WIKIPEDIA_TALK = 5
      FILE = 6
      FILE_TALK = 7
      MEDIAWIKI = 8
      MEDIAWIKI_TALK = 9
      TEMPLATE = 10
      TEMPLATE_TALK = 11
      HELP = 12
      HELP_TALK = 13
      CATEGORY = 14
      CATEGORY_TALK = 15
      PORTAL = 100
      PORTAL_TALK = 101
      DRAFT = 118
      DRAFT_TALK = 119
      MOS = 126
      MOS_TALK = 127
      TIMEDTEXT = 710
      TIMEDTEXT_TALK = 711
      MODULE = 828
      MODULE_TALK = 829
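
For anyone wondering where the namespace IDs actually show up: each `<page>` in a dump carries an `<ns>` element, so filtering for real articles comes down to comparing that number. Here is a rough sketch of the idea, not a full dump parser. The tiny inline sample, the `iter_titles` and `local_name` helpers, and the three-member enum are all made up for illustration, and the export schema URI is an assumption (real dumps declare a versioned one).

```python
import io
import xml.etree.ElementTree as ET
from enum import IntEnum


class Namespace(IntEnum):
    # Small subset of the full list, just enough for the demo.
    ARTICLE = 0
    TALK = 1
    TEMPLATE = 10


# Hypothetical miniature dump; real dumps are far larger, and the schema
# version in the xmlns URI is assumed here.
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Python</title><ns>0</ns><id>1</id></page>
  <page><title>Talk:Python</title><ns>1</ns><id>2</id></page>
  <page><title>Template:Infobox</title><ns>10</ns><id>3</id></page>
</mediawiki>"""


def local_name(tag):
    """Drop the '{uri}' prefix ElementTree puts on namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def iter_titles(stream, wanted=Namespace.ARTICLE):
    """Yield titles of pages whose <ns> value equals `wanted`."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        fields = {local_name(child.tag): child.text for child in elem}
        if int(fields["ns"]) == wanted:
            yield fields["title"]
        elem.clear()  # drop the page's children so memory stays flat


articles = list(iter_titles(io.BytesIO(SAMPLE_DUMP)))
print(articles)
```

Streaming with `iterparse` instead of loading the whole tree matters here because the full dumps are many gigabytes; matching on the local tag name sidesteps hard-coding whichever schema version a given dump declares.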

Wikipedia is a downright fascinating technical environment once you find the rabbit hole. Shoutout to their purpose-built version control site[1] and their brand-new SWE-focused project "WikiFunctions"[2], the first new Wikimedia project in a decade!

...which, while we're at it, brings the total to 18: wikipedia, wikibooks, wikinews, wikisource, wiktionary, wikiquote, wikiversity, wikivoyage, wikidata, wikifunctions, mediawiki, commons, species, foundation, meta, incubator, and phabricator. Ok I'm done with fun facts, I swear!

[1] https://phabricator.wikimedia.org/

[2] https://www.wikifunctions.org/