Are there actual good examples showing errors of fact on Wikipedia that are verifiably incorrect, that demonstrate how it is "captured"?
How about Grabowski et al.: "Wikipedia’s Intentional Distortion of the History of the Holocaust", about the outsize influence of certain coordinated Polish editors on the Wikipedia articles about Poland and the Holocaust?
https://www.tandfonline.com/doi/epdf/10.1080/25785648.2023.2...
Quote from the conclusion:
> This essay has shown that in the last decade, a handful of editors have been steering Wikipedia’s narrative on Holocaust history away from sound, evidence-driven research, toward a skewed version of events touted by right-wing Polish groups. Wikipedia’s articles on Jewish topics, especially on Polish–Jewish history before, during, and after World War II, contain and bolster harmful stereotypes and fallacies. Our study provides numerous examples, but many more exist. We have shown how the distortionist editors add false content and use unreliable sources or misrepresent legitimate ones.
For a more recent example, the paper "Disinformation as a tool for digital political activism: Croatian Wikipedia and the case for critical information literacy" by Car et al. says:
> The Hr.WP [Croatian Wikipedia] case exemplifies disinformation not only as content manipulation, but also as process manipulation weaponising neutrality and verifiability policies to suppress dissent and enforce a single ideological position.
https://doi.org/10.1108/JD-01-2025-0020
I find it more surprising that the common understanding has shifted away from "wikis are crap for anything new or political".
Given the way Wikipedia works, we should be sceptical as soon as there is a plausible agenda for selecting a narrative.
For recent examples, see everything to do with Biden and family, and Gamergate. These pages are still full of discussion, and what's written is more ideological than factual. You can follow these pages to see how an in-group selects a narrative.
And these topics are not nearly as controversial as race, feminism, or transgender topics.
OK, is there a specific example on either the Biden or Gamergate page that is factually incorrect? Or are you saying the entire pages are false?
The Minnesota Transracial Adoption Study was methodologically flawed. “Children with two black parents were significantly older at adoption, had been in the adoptive home a shorter time, and had experienced a greater number of preadoption placements.”
Reframed, the study seemed to find (a) black kids are adopted less readily and (b) the longer a kid spends in the foster system, the lower their IQ at 17. (There is also limited controlling for epigenetic factors because we didn’t understand those well in the 1970s and 80s.)
Given how new human cognition is, and how genetically similar human races are, it would be somewhat groundbreaking to find that an emergent complex trait like IQ maps onto social constructs like race, particularly ones as broad as American white and black. (There is more genetic diversity within single African tribes than in some small European countries. And American whites and blacks are both complex, hybridized social categories.)
[1] https://en.wikipedia.org/wiki/Minnesota_Transracial_Adoption...
It seems like the root of your statement is with the existence of "race" as a purely biological classification. Wikipedia correctly notes the consensus position that race is a social construct [0] that's difficult to use accurately when discussing IQ. Grok makes the implicit and incorrect assumption that genetic factors = race, among other issues.
[0] https://www.genome.gov/genetics-glossary/Race
Have you considered the possibility that your opinion is just not representative of the scientific consensus?
>As you can see, Wikipedia is very dismissive to the point of effectively lying.
Did I miss where you presented evidence that wikipedia is wrong? You seem to be taking an assumption you carry (race is related to IQ) and assuming everyone believes it's true as well, thus wikipedia is lying.
It's not errors of fact, it's errors of omitted facts.
Are there actual good examples showing errors of omitted facts on Wikipedia that are verifiably correct, that demonstrate how it is "captured"?
I’d say Wikipedia definitely has a strong “woke” bent to it. Either in the language or the choice of what facts to show. Here’s an example I deleted that had been there for quite a while https://en.wikipedia.org/w/index.php?title=Salvadoran_gang_c...
I really like Wikipedia, though, and I think over time we will get around to fixing it up.
Why did you feel this passage was worth deleting?
I can understand somebody not liking wikipedia, I cannot understand at all somebody, who is not Elon, liking/preferring "grokipedia" as idea or implementation.
> I cannot understand at all somebody, who is not Elon, liking/preferring "grokipedia" as idea or implementation.
Really? Have you used AI to write documentation for software? Or used AI to generate deep research reports by scouring the internet?
Because, while both can have some issues (but so do humans), AI already does extremely well at both those tasks (multiple models do, look at the various labs' Deep Research products, or look at NotebookLM).
Grokipedia is roughly the same concept: "take these 10,000 topics and, for each topic, make a deep research report, verify claims, and so on, making minimal changes to the resulting report and preserving citations."
So it's not like it's automatically some anti-woke can't-be-trusted thing. In fact, if you trust the idea of an AI doing deep research reports, this is a generalizable and automated form of that.
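To make the "deep research report per topic" concept concrete, here is a minimal sketch of that pipeline. Everything in it is a hypothetical stand-in: the function names and data shapes are invented for illustration, and the generation and verification steps are stubs where real LLM and fact-checking calls would go; nothing here reflects a published xAI design.

```python
# Hypothetical sketch of the described pipeline: for each topic,
# draft a research report, verify its citations, and keep edits minimal.

def generate_report(topic: str) -> dict:
    # Stand-in for a "deep research" LLM call that drafts an article
    # and collects supporting citations.
    return {
        "topic": topic,
        "body": f"Draft article on {topic}.",
        "citations": [f"https://example.org/{topic}"],
    }

def verify_citations(report: dict) -> dict:
    # Stand-in for a verification pass: drop citations that fail a
    # (here trivial) check, preserving the ones that pass.
    report["citations"] = [
        c for c in report["citations"] if c.startswith("https://")
    ]
    return report

def build_encyclopedia(topics: list[str]) -> list[dict]:
    # One verified report per topic; further passes would only make
    # minimal changes to the existing report.
    return [verify_citations(generate_report(t)) for t in topics]

articles = build_encyclopedia(["Alan_Turing", "Photosynthesis"])
```

The interesting design question is in the verification step: the sketch only filters citations, whereas a real system would have to check that each citation actually supports the claim it is attached to, which is exactly the part critics doubt an LLM can do reliably.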
We can judge an idea on its merits, politics aside. I think it's a fascinating idea in general (like the idea of AI writing software documentation or doing deep research reports), setting aside whether it needs tweaks to remove political bias.
No, I don't trust an encyclopedia generated by AI. Projects with much narrower scopes are not comparable.
edit: I am not very excited by AI-generated documentation either. I think that LLMs are very useful tools, but I see a potential problem when the sources of information that their usefulness is largely based on are also LLM-generated. I am afraid that this will inevitably result in a drop in quality that will also affect the LLMs themselves downstream. I think we underestimate how much the intentionality of human-written text in LLM training sets and context windows contributes to their giving relevant, useful output.
> Have you used AI to write documentation for software?
Hi. I have edited AI-generated first drafts of documentation -- in the last few months, so we are not talking about old and moldy models -- and describing the performance as "extremely well" is exceedingly generous. Large language models write documentation the same way they do all tasks, i.e., through statistical computation of the most likely output. So, in no particular order:
- AI-authored documentation is not aware of your house style guide. (No, giving it your style guide will not help.)
- AI-authored documentation will not match your house voice. (No, saying "please write this in the voice of the other documentation in this repo" will not help.)
- The generated documentation will tend to be extremely generic and repetitive, often effectively duplicating other work in your documentation repo.
- Internal links to other pages will often be incorrect.
- Summaries will often be superfluous.
- It will love "here is a common problem and here is how to fix it" sections, whether or not that's appropriate for the kind of document it's writing. (It won't distinguish reliably between tutorial documentation, reference documentation, and cookbook articles.)
- The common problems it tells you how to fix are sometimes imagined and frequently not actually problems worth documenting.
- It's subject to unnecessary digression, e.g., while writing a high-level overview of how to accomplish a task, it will mention that using version control is a good idea, then detour for a hundred lines giving you a quick introduction to Git.
As for using AI "to generate deep research reports by scouring the internet", that sounds like an incredibly fraught idea. LLMs are not doing searches, they are doing statistical computation of likely results. In practice the results of that computation and a web search frequently line up, but "frequently" is not good enough for "deep research": the fewer points of reference for a complex query there are in an LLM's training corpus, the more likely it is to generate a bullshit answer delivered with a veneer of absolute confidence. Perhaps you can make the case that that's still a good place to start, but it is absolutely not something to rely on.
> "grokipedia" as idea
So you can understand someone not liking something, but you cannot understand that person liking the idea of an alternative? What is the idea for you if not just an alternative to the established service with the undesired part changed?
Because not liking something does not imply liking any possible alternative.
Which is the "undesired part changed" here? Wikipedia is written by humans; it has a not-for-profit governance model; it encompasses a large, international community of authors and editors that attempts to operate democratically; and it has a commitment to being an openly available, public source of information. Grokipedia, on the other hand, is AI-generated and operated by a for-profit AI company. Even if grokipedia somehow managed to get traction and "overthrow" wikipedia, there is no reason on earth why a company would operate it for free and not try to profit from it, or use it for its own ends in ways far more direct than whatever may or may not be happening to wikipedia. Having a billionaire control something that may be considered the "ground truth" of information seems a bad idea, and having AI generate it an even worse one.
I can understand somebody not liking something in how wikipedia is governed or operates; after all, anything that involves getting humans to work together at such a scale is bound to be challenging. I can understand somebody ideologically disagreeing with some of the stances that such a project eventually has to take (even trying to be as neutral as possible, it is impossible to avoid some clash somewhere about where exactly that neutrality lies). But grokipedia is much more than "wikipedia but different ideologically".
edit: just to be clear, I see a critique of the "idea of grokipedia" as eg the critique of it being a billionaire controlled, AI generated project to substitute wikipedia; a critique of the implementation would be finding flaws to actual articles in grokipedia (overall). I think the idea of it is already flawed enough.
They meant the idea of Wikipedia rewritten by Grok (or another controversial LLM) specifically, not just any alternative.
Not all alternatives are necessarily worthy. I can understand someone not liking tomatoes. I can't understand someone liking depleted uranium.
Elon at some point threatened to have an LLM rewrite all of the training data to remove woke. I assume Grokipedia is his experiment at doing this (and perhaps hoping it will infect other training sets too?) ...
I appreciate you