This came up on HN a few months ago, when someone posted a list of most-translated articles and Woodard was at the top: https://it.wikipedia.org/wiki/Wikipedia:Pagine_da_cancellare...
I could've sworn I remembered such a post, thank you so much for vindicating my hunch! At the time I figured there wasn't much harm in it, but in hindsight it's obvious that the absurd number of translations was just the smoke stemming from a self-promotion fire.
Props to whatever HackerNewsian (YCombinist?) took the time to chase all this down and do this fascinating writeup! You will be remembered in /r/TodayILearned posts every few months for many decades to come, no doubt.
I see the defense in that context that admins aren't really mods, when practically speaking they do act like mods by closing discussions. In theory this happens when "Wikipedia has reached an opinion". In practice it can very easily be when it has reached their opinion.
I thought this was referring to articles as in the part of speech (i.e. there are nouns, verbs, but also articles like “a” or “the”) given the title and something spanning across languages… I wonder what his exact thought process was that motivated all that effort?
That was my expectation as well, because most languages don't have a concept of articles.
According to one count, 32% of languages don't have articles (although that is based on only 620 languages: 198 / 620).
https://wals.info/chapter/37
This is not interesting than the title initially suggests. It’s not merely a curiosity, but an investigation:
> I discovered what I think might have been the single largest self-promotion operation in Wikipedia’s history, spanning over a decade and covering as many as 200 accounts and even more proxy IP addresses.
Quite the contrary, the story is rather fascinating. (Or did you mean to say "more interesting"?)
If you want even more gruesome details, the story of how this all unraveled plus all sorts of info about Woodard, a positively creepy white supremacist, can be found on the English article's talk page:
https://en.wikipedia.org/wiki/Talk:David_Woodard/Archive_1
https://en.wikipedia.org/wiki/Talk:David_Woodard
And with this anomaly removed, the list of articles in the most languages is back to what you'd expect: the top 10 is all large countries and Wikipedia itself.
https://en.wikipedia.org/w/index.php?title=Wikipedia:Wikiped...
> Or did you mean to say "more interesting"?
I did, yes, that was a typo. I did notice it after the edit window was closed but the submission hadn’t had any traction so it felt silly to reply to my own comment to correct it.
Glad the submission was resurrected, I think it deserves it. My original comment was precisely to convince people to give it a read.
Some of these are still quite suspicious imo. "True Jesus Church", a church of a few million people ranking above Jesus?
Though, if you restrict to just people, then, surprisingly, Corbin Bleu is #20.
So they only got caught because they were too efficient in their scheme and rose to number 1 in translations. How many more schemes go unnoticed? Not saying Wikipedia is not doing a great job, just saying that there are probably a lot of such schemes and that it seems nearly impossible to stop them all. It’s sad that a lot of people don’t want the truth to be available, at least when it concerns themselves. They want you to only know what they think you should, like on their Instagram.
So, should the David Woodard article have a section about this?
Wikipedia is not a reliable source for a Wikipedia article, so someone would have to create a news article talking about it.
Thus completing the Wikipedia human centipede.
See https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Self...
...and I would have gotten away with it if it weren't for you meddling kids!
I find it interesting that the whole scheme might not have been noticed had he been more modest and not tried to translate the pages into rare languages. We don't know the motive, but if it was self-promotion, these additional languages were presumably of negligible value yet risked the scheme.
On the contrary, it's precisely by "risking" the scheme that the self-promotion became effective.
It's quite unlikely for anybody to stumble upon any given English-language Wikipedia article by chance, given that there are literally millions of them now. Therefore, the promotional value of having a Wikipedia article on something even in a popular language is negligible. However, by spamming all the Wikipedias, and having this "scheme" discovered, Woodard created a situation where he is widely reported on as the artist that spammed Wikipedia, and has therefore received the five minutes of fame that he so desperately wanted.
If he had stuck to spamming the English Wikipedia, would he have ended up on the frontpage of HN?
This was clearly the endgame all along.
Quietly having all these articles might be personally satisfying in some way, but his obvious appetite for fame or notoriety points toward him wanting the scheme to be exposed. In fact I would not be entirely surprised if he somehow instigated the discovery of his activities.
Ironically now this person has become notorious for Wiki-pollution. Since he's an "artist", he can claim it was an art project.
Sadly because it's 2025, he has a lot of competition for the award of "most insufferable douchebag".
This may be a "well, of course it's that way" observation to some, but: the article on X in Wikipedia is typically quite different in one language than in another. So you can get interesting insights by reading about X in different languages.
For example, the French article about David Hockney has a lovely Francophone twist in that the first few lines point out that he lived in Normandy for a few years, whereas English Wikipedia buries the fact deep in the page. The page for VLC has a photo of the lead dev in the French page but no discussion of the plugin architecture. And so on. It doesn't seem unreasonable to me to assume that the pages in some languages might be particularly strong if the topic plays a bigger role in the culture than in the English-speaking world.
It's also interesting to see what decisions editors have made about animals. In English, for example, the title of the article for the African elephant[1] is just the animal's common name.
In Italian, Spanish, and Tagalog the title is the scientific name of the animal.
This makes sense in languages (like Spanish) where an animal may have a lot of different names depending on the country, region, or dialect. If you look at the article for Pig[2], you'll see at least fifteen names listed.
[1] https://en.wikipedia.org/wiki/African_bush_elephant
[2] https://es.wikipedia.org/wiki/Sus_scrofa_domestica
I have great respect for and am impressed by the work that has been done. I also appreciate the explanations in this article. One question remains (perhaps related to my limited knowledge of Wikipedia’s processes): why is there no reference to this work on Woodard’s page?
"Original research" is a cardinal sin on Wikipedia, meaning it's not eligible for inclusion in Wikipedia unless news outlets outside Wikipedia pick up the story and start publishing stories about it.
I’ve always thought that the criteria for inclusion on Wikipedia should simply be: is it true and is it verifiable. All the other criteria (notability, no original research, etc.) really shouldn’t matter.
Shameless plug about a little game I wrote a few years ago, about guessing which pages exist in the most languages in wikipedia:
https://wikilingua.charlespierre.fr/
Fascinating! A detective story for our age.
We could probably add it to multilingual training sets for A.I.
Previously, the ones trained on a thousand or more languages by Meta and Wycliffe used the Bible since it's the only complex, rich message translated into most human languages. Which God said would happen to His authentic message. :)
https://interestingengineering.com/innovation/meta-used-bibl...
What a coincidence. Just yesterday I watched a YouTube video[0] about Corbin Bleu being the 3rd most translated article on Wikipedia after Jesus and Barack Obama. Not surprised to see that it was a one-user effort once again.
[0] https://youtu.be/vJ_pEP3fRvM
Oh boy, that picture of Woodard...I've never seen anyone who would look quite so comfortable in an SS uniform. Maybe Stephen Miller. Elon Musk looks like an imperial admiral from Star Wars being choked by Darth Vader's invisible hand. Once you see it, you won't be able to un-see it.
[flagged]
Honestly, what kind of harm was it?
If you let astroturfing happen on Wikipedia grounds it'll become a piece of useless crap just like much of the rest of the Internet. If you read the report you'll learn that the promoters weren't content just with their own entry but tried to sneak references into unrelated popular articles.
Yup. From the report: On the English Wikipedia alone, Woodard’s name was inserted into no fewer than 93 articles, including Pliers, Brown pelican and Bundesautobahn 38.
Reminds me of those "Edit Wikipedia as homework" college assignments.
For some subjects, it's appropriate to host multiple versions of articles written natively in different languages.
But for other subjects, for example science and mathematics, it does a huge disservice to non-English readers: it means that their Wikipedia is second-rate, or worse.
Wikipedia should, in science, mathematics, and other subjects that do not have cultural inflection, use machine translation so that all articles in all languages are translations of the same underlying semantic content.
It would still be written by humans. But ML / LLMs would be involved in the editing pipeline so that people lacking a common language can edit the same text.
This is the biggest mistake Wikipedia's made IMO: it privileges English readers since the English content is highest quality in most areas that are not culturally specific, and I do not think that it's an organization that wants to privilege English readers.
Users can already translate English Wikipedia articles to other languages on the fly with Chrome etc. However, the quality of the translation is just not up to scratch yet, particularly for languages that are radically different from English; just try reading some ML-translated Japanese or Chinese Wikipedia articles.
I'm not sure why you're thinking that; perhaps you're remembering something from a few years ago, or perhaps you have a prior biased against ML/LLM solutions. The translations provided by Google Chrome of Chinese Wikipedia pages seem great. For example, Linear Algebra:
https://zh.wikipedia.org/wiki/%E7%BA%BF%E6%80%A7%E4%BB%A3%E6...
The problem, as I pointed out, is that readers of non-English pages are accessing a second-rate Wikipedia. For example: https://zh.wikipedia.org/wiki/%E8%B2%9D%E7%A5%96%E5%AE%9A%E7...
Science and Mathematics have no cultural inflection? Do you speak more than one language? Each language has its standard sentence structures when it comes to these disciplines, and auto translators are very much not up to the task.
I prefer my Wikipedia to remain 100% human-generated quality information over garbage AI slop content, which is already abundant enough on the internet.
I understand you'd like to believe that. But it looks like you're simply wrong. For example, here is the translation produced by Google Chrome of the Chinese version of the page on Bezout's Theorem.
https://zh.wikipedia.org/wiki/%E8%B2%9D%E7%A5%96%E5%AE%9A%E7...
It reads "Bézout's theorem is a theorem in algebraic geometry that describes the number of intersections of two algebraic curves . The theorem states that the number of intersections of two coprime curves X and Y is equal to the product of their degrees."
which is perfectly good English. The problem is that that is the entire page! It is thus woefully inadequate in comparison to the English page:
https://en.wikipedia.org/wiki/B%C3%A9zout%27s_theorem
> it means that their Wikipedia is second-rate, or worse.
?
You shouldn't respond lazily like that.
The point is that the quality of a Wikipedia page is positively correlated with the number of editors working on it.
What an uninformative headline. I was going to chip in with the annoyance that a romance language like Romanian appends the article to the word, Russian-style.
Multiple definitions of a word are tricky to work around, especially when most of Wikipedia's documents are called "articles".
Random unprompted fun fact: Articles are the main type of "Page" on Wikipedia, but not the only type! Buried deep in their docs is the full list of 'namespaces', which you need in order to parse their XML dumps:
Wikipedia is a downright fascinating technical environment once you find the rabbit hole. Shoutout to their purpose-built version control site[1] and their brand-new SWE-focused project "WikiFunctions"[2], the first new Wikimedia project in a decade!
...which, while we're at it, brings the total to 17: wikipedia, wikibooks, wikinews, wikisource, wiktionary, wikiquote, wikiversity, wikivoyage, wikidata, wikifunctions, mediawiki, commons, species, foundation, meta, incubator, and phabricator. Ok I'm done with fun facts, I swear!
[1] https://phabricator.wikimedia.org/
[2] https://www.wikifunctions.org/
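To make the namespace point above a bit more concrete, here is a rough sketch of why the list matters when reading a dump. It assumes the usual MediaWiki export layout, where `<siteinfo>` carries a `<namespaces>` table and each `<page>` has an `<ns>` number. The tiny inline sample, the `parse_dump` helper and the "(main)" label are made up for illustration and are not taken from any real dump.

```python
import xml.etree.ElementTree as ET

# Hand-written miniature of a MediaWiki XML export (illustrative only).
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <siteinfo>
    <namespaces>
      <namespace key="0" case="first-letter" />
      <namespace key="1" case="first-letter">Talk</namespace>
      <namespace key="2" case="first-letter">User</namespace>
    </namespaces>
  </siteinfo>
  <page><title>Pliers</title><ns>0</ns></page>
  <page><title>Talk:Pliers</title><ns>1</ns></page>
  <page><title>User:Example</title><ns>2</ns></page>
</mediawiki>"""


def local(tag):
    """Strip the '{xml-namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_dump(xml_text):
    """Return (namespace id -> name, namespace id -> list of page titles)."""
    root = ET.fromstring(xml_text)
    namespaces = {}
    titles_by_ns = {}
    for elem in root.iter():
        name = local(elem.tag)
        if name == "namespace":
            # The article namespace (key 0) has no name, so label it ourselves.
            namespaces[int(elem.get("key"))] = elem.text or "(main)"
        elif name == "page":
            title = ns = None
            for child in elem:
                if local(child.tag) == "title":
                    title = child.text
                elif local(child.tag) == "ns":
                    ns = int(child.text)
            titles_by_ns.setdefault(ns, []).append(title)
    return namespaces, titles_by_ns


namespaces, titles_by_ns = parse_dump(SAMPLE_DUMP)
for ns_id, titles in sorted(titles_by_ns.items()):
    print(ns_id, namespaces[ns_id], titles)
```

Only the pages with `<ns>0</ns>` are articles in the encyclopedic sense. A real dump is far too large for `ET.fromstring`, so a streaming parser such as `ET.iterparse` would be the more realistic choice there.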
I find the hubris of this article absolutely disheartening, and toxic, and it frankly just reinforces how Wikipedia isn't a good place, and people who shouldn't have control over it have control over it.
And it isn't because of the self promoting described, but because of the response to it.
Deletionists are evil.
Could you expand on why you feel that this series of deletions is wrong?
Apart from the fact that this was pure self-promotion, it was also spamming the Wikipedias of small language communities with low-effort autotranslated garbage, which I think is rather insulting.
Care to explain what was bad about the response?
I don't understand why somebody didn't fork the Wikipedia and build the version where you can self promote. It kinda sucks that you are not allowed to claim and edit your Wikipedia page.
They did. It's called 'DNS' and you can set up a 'page' about yourself if you want.
It doesn’t kinda suck.
Wikipedia is supposed to be an encyclopaedia, which means it’s intended to come with some expectation of neutrality.
If you could edit your own page, do you really think it’d stay as factual and as neutral as possible?
Just make yourself a website.
I'm sure someone did.
Facebook and Instagram are too cheesy, I want something more genuine.
This only offers me 19 languages: https://en.wikipedia.org/wiki/David_Woodard
The article claims that it has 335.
Explained at the end of the article:
> After a full month of coordinated, decentralised action, the number of articles about Mr. Woodard was reduced from 335 articles to 20. A full decade of dedicated self-promotion by an individual network has been undone in only a few weeks by our community.
It’s a good idea to read whatever you’re commenting on
That is a most improper suggestion on this here orange website. It is established etiquette to _imagine what the content of the article might be_, based on the title, and then comment on that, preferably angrily. At _absolute most_ one can read the first paragraph.
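For anyone who would rather count than read: the 19-versus-20 gap upthread is probably just the sidebar not listing the language you are already on. A minimal sketch of checking that with the MediaWiki Action API (`action=query`, `prop=langlinks`) follows. The helper names and the sample response are invented for illustration, the sample is not real data about any page, and continuation of long result lists is ignored.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def langlinks_url(title):
    """Build a query URL asking for a page's interlanguage links."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def count_language_editions(response):
    """Count editions: one per interlanguage link, plus the page queried."""
    page = response["query"]["pages"][0]
    links = page.get("langlinks", [])
    return len({link["lang"] for link in links}) + 1


# Hand-written response in the formatversion=2 shape (assumed, not fetched).
SAMPLE_RESPONSE = {
    "query": {
        "pages": [
            {
                "title": "Example",
                "langlinks": [
                    {"lang": "de", "title": "Beispiel"},
                    {"lang": "fr", "title": "Exemple"},
                    {"lang": "it", "title": "Esempio"},
                ],
            }
        ]
    }
}

print(langlinks_url("David Woodard"))
print(count_language_editions(SAMPLE_RESPONSE))
```

On that reading, 19 languages in the sidebar plus the English page itself would match the 20 articles the write-up says remain, though that is an inference rather than something the article states.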