Comment by Hard_Space

6 months ago

What an uninformative headline. I was going to chip in with the annoyance that a Romance language like Romanian appends the article to the word, Russian-style.

Multiple definitions of a word are tricky to work around, especially when most of Wikipedia's documents are called "articles".

  • Random unprompted fun fact: Articles are the main type of "Page" on Wikipedia, but not the only type! Buried deep in their docs is the full list of 'namespaces', which you need in order to parse their XML dumps:

      from enum import IntEnum


      class Namespace(IntEnum):
          # Numeric order; each odd value is the talk page of the even one before it.
          MEDIA = -2
          SPECIAL = -1
          ARTICLE = 0
          TALK = 1
          USER = 2
          USER_TALK = 3
          WIKIPEDIA = 4
          WIKIPEDIA_TALK = 5
          FILE = 6
          FILE_TALK = 7
          MEDIAWIKI = 8
          MEDIAWIKI_TALK = 9
          TEMPLATE = 10
          TEMPLATE_TALK = 11
          HELP = 12
          HELP_TALK = 13
          CATEGORY = 14
          CATEGORY_TALK = 15
          PORTAL = 100
          PORTAL_TALK = 101
          DRAFT = 118
          DRAFT_TALK = 119
          MOS = 126
          MOS_TALK = 127
          TIMEDTEXT = 710
          TIMEDTEXT_TALK = 711
          MODULE = 828
          MODULE_TALK = 829
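
    For anyone wondering where those numbers show up: each <page> in a dump carries an <ns> element, so filtering by namespace is mostly an integer comparison. Here's a rough sketch using only the standard library. The sample XML is a made-up, heavily trimmed stand-in for a real dump, the export schema URI in it is an assumption (the code ignores it anyway), and local_name / iter_titles are names I invented for illustration. The enum is cut down to three members to keep it short.

```python
import io
import xml.etree.ElementTree as ET
from enum import IntEnum


class Namespace(IntEnum):
    # Trimmed to three members for brevity; values match the full list above.
    ARTICLE = 0
    TALK = 1
    TEMPLATE = 10


# Hypothetical sample: real dumps have many more fields per page.
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Romanian language</title><ns>0</ns></page>
  <page><title>Talk:Romanian language</title><ns>1</ns></page>
  <page><title>Template:Infobox language</title><ns>10</ns></page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{uri}' prefix ElementTree puts on namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def iter_titles(source, wanted):
    """Yield titles of pages whose <ns> value equals `wanted`."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        fields = {local_name(child.tag): child.text for child in elem}
        # IntEnum members compare equal to plain ints.
        if int(fields["ns"]) == wanted:
            yield fields["title"]
        elem.clear()  # drop the finished page's children to save memory


articles = list(iter_titles(io.BytesIO(SAMPLE_DUMP), Namespace.ARTICLE))
print(articles)  # ['Romanian language']
```

    iterparse streams the file instead of loading it whole, which matters because the real dumps are enormous. Swap the BytesIO for an open file handle to try it on an actual dump.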
    

    Wikipedia is a downright fascinating technical environment once you find the rabbit hole. Shoutout to their purpose-built version control site[1] and their brand-new SWE-focused project "Wikifunctions"[2], the first new Wikimedia project in a decade!

    ...which, while we're at it, brings the total to 18: wikipedia, wikibooks, wikinews, wikisource, wiktionary, wikiquote, wikiversity, wikivoyage, wikidata, wikifunctions, mediawiki, commons, species, foundation, meta, incubator, and phabricator. Ok I'm done with fun facts, I swear!

    [1] https://phabricator.wikimedia.org/

    [2] https://www.wikifunctions.org/