Comment by zozbot234
8 hours ago
Wikipedia is not a pure hosting operation; it's trying to foster a worldwide community of practice of volunteer contributors that can be sustainable in the long term, and that does take quite a bit of spending. I have no idea why so many people keep getting this wrong.
The Wikimedia Foundation is a full-fledged cloud services provider. They host applications and developers on their cloud platform. These developers have been working with AI and scripted solutions for a long, long time. ClueBot is the premier example of an AI (ML)-powered solution to combat vandalism.
So Wikipedia is not merely a "cloud app with cloud storage"; it is a first-class cloud-based platform. The English project is merely the largest and best-known; there are hundreds and hundreds of other projects hosted on WMF's cloud services. The developers and bot operators who work in the backend are barely visible to end users or even everyday editors, but they are also the backbone of WMF services, and WMF admins and developers support them in running the applications that help editors and wiki admins do their jobs.
> "I have no idea why so many people keep getting this wrong."
To me it seems a perfectly natural effect of nearly everyone using it as a website which holds lots of information, while comparatively few people have any experience with the community side, so people assume that what they see is what Wikipedia is.
Not many people spend time reading reports on organisational cost breakdowns for Wikipedia, so the only way they'd know is if someone like you actively tells them. I personally also assumed server costs were the vast majority, with legal costs a probable distant second - but your comment has inspired me to actually go and look for a breakdown of their spending, so thanks.
Edit: FY24-25, "infrastructure" was just 49.2% of their budget - from https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_...
Wikipedia is also uniquely cacheable.
I suspect that 95+% of visits to Wikipedia don't actually require them to run any PHP code, but are instead just served from some cache, as each Wikipedia user viewing a given article (if they're not logged in) sees basically the same thing.
This is in contrast to, e.g., a social network, which needs to calculate timelines per user. Even if there's no machine learning and your algorithm is "most recent posts first", there's still plenty of computation involved. Mastodon is a good example here.
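To make the contrast concrete, here is a rough Python sketch. It is a toy model, not how Wikipedia or Mastodon actually work: `PageCache`, `render`, and `timeline` are hypothetical names invented for illustration. The cache serves every anonymous reader the same stored copy of an article, while the "most recent posts first" timeline has to be merged separately for each user's follow list.

```python
import heapq
from itertools import islice


class PageCache:
    """Toy cache: one rendered copy per article, shared by all anonymous readers."""

    def __init__(self, render):
        self.render = render      # hypothetical stand-in for the expensive backend render
        self.store = {}
        self.backend_calls = 0

    def get(self, title, logged_in=False):
        if logged_in:
            # Assumption for this sketch: personalised views bypass the shared cache.
            self.backend_calls += 1
            return self.render(title)
        if title not in self.store:
            self.backend_calls += 1
            self.store[title] = self.render(title)
        return self.store[title]


def timeline(follows, posts_by_author, limit=3):
    """'Most recent first' timeline: merge the followed authors' posts for one user."""
    # Each per-author list is assumed to be sorted newest first as (timestamp, text).
    streams = (posts_by_author.get(author, []) for author in follows)
    merged = heapq.merge(*streams, key=lambda post: post[0], reverse=True)
    return list(islice(merged, limit))


# 1000 anonymous views of one article cost a single backend render.
cache = PageCache(render=lambda title: f"<html>{title}</html>")
for _ in range(1000):
    cache.get("Encyclopedia")
cache.get("Encyclopedia", logged_in=True)

# Two users with different follow lists need two separate merges.
posts = {
    "alice": [(9, "a3"), (5, "a2"), (1, "a1")],
    "bob": [(8, "b2"), (2, "b1")],
    "carol": [(7, "c1")],
}
tl_u1 = timeline(["alice", "bob"], posts)
tl_u2 = timeline(["bob", "carol"], posts)
```

In this toy run the backend renders the article twice for 1001 views, whereas each user's timeline is its own computation that nobody else can reuse.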
The move away from "most recent posts first" is because that's actually harder at scale than the algorithmic timeline.
As a former Wikipedia admin, I think the best way to think of it is as a massive text-focused battle MMORPG that happens to produce an encyclopedia as a side effect.
Yep, the encyclopedia is the not-so-wasteful "proof of work" part of the MMORPG. It's a game, but you grind it by working on generally useful stuff.
Haha, and with battles in the form of massive flame wars?
> holds lots of information
But they want that information to be at least kept up to date and hopefully to improve over time, right? That's what the community is for. It's not a free lunch.