Comment by freehorse
14 hours ago
I can understand somebody not liking wikipedia, I cannot understand at all somebody, who is not Elon, liking/preferring "grokipedia" as idea or implementation.
14 hours ago
I can understand somebody not liking wikipedia, I cannot understand at all somebody, who is not Elon, liking/preferring "grokipedia" as idea or implementation.
Elon at some point threatened to have an LLM rewrite all of the training data to remove woke. I assume Grokipedia is his experiment at doing this (and perhaps hoping it will infect other training sets too?) ...
> I cannot understand at all somebody, who is not Elon, liking/preferring "grokipedia" as idea or implementation.
Really? Have you used AI to write documentation for software? Or used AI to generate deep research reports by scouring the internet?
Because, while both can have some issues (but so do humans), AI already does extremely well at both those tasks (multiple models do, look at the various labs' Deep Research products, or look at NotebookLM).
Grokipedia is roughly the same concept of "take these 10,000 topics, and for each topic make a deep research report, verify stuff, etc, and make minimal changes to the existing deep research report on it. preserve citations"
So it's not like it's automatically some anti-woke can't-be-trusted thing. In fact, if you trust the idea of an AI doing deep research reports, this is a generalizable and automated form of that.
We can judge an idea by its merits, politics aside. I think it's a fascinating idea in general (like the idea of writing software documentation or doing deep research reports), whether it needs tweaks to remove political bias aside.
No, I don't trust an encyclopedia generated by AI. Projects with much narrower scopes are not comparable.
edit: I am not very excited by AI-generated documentations either. I think that LLMs are very useful tools, but I see a potential problem when the sources of information that their usefulness is largely based on is also LLM-generated. I am afraid that this will inevitably result in drop in quality that will also affect the LLMs themselves downstream. I think we underestimate the importance that intentionality in human-written text plays in being in the training sets/context windows of LLMs for them to give relevant/useful output.
> Have you used AI to write documentation for software?
Hi. I have edited AI-generated first drafts of documentation -- in the last few months, so we are not talking about old and moldy models -- and describing the performance as "extremely well" is exceedingly generous. Large language models write documentation the same way they do all tasks, i.e., through statistical computation of the most likely output. So, in no particular order:
- AI-authored documentation is not aware of your house style guide. (No, giving it your style guide will not help.)
- AI-authored documentation will not match your house voice. (No, saying "please write this in the voice of the other documentation in this repo" will not help.)
- The generated documentation will tend to be extremely generic and repetitive, often effectively duplicating other work in your documentation repo.
- Internal links to other pages will often be incorrect.
- Summaries will often be superfluous.
- It will love "here is a common problem and here is how to fix it" sections, whether or not that's appropriate for the kind of document it's writing. (It won't distinguish reliably between tutorial documentation, reference documentation, and cookbook articles.)
- The common problems it tells you how to fix are sometimes imagined and frequently not actually problems worth documenting.
- It's subject to unnecessary digression, e.g., while writing a high-level overview of how to accomplish a task, it will mention that using version control is a good idea, then detour for a hundred lines giving you a quick introduction to Git.
As for using AI "to generate deep research reports by scouring the internet", that sounds like an incredibly fraught idea. LLMs are not doing searches, they are doing statistical computation of likely results. In practice the results of that computation and a web search frequently line up, but "frequently" is not good enough for "deep research": the fewer points of reference for a complex query there are in an LLM's training corpus, the more likely it is to generate a bullshit answer delivered with a veneer of absolute confidence. Perhaps you can make the case that that's still a good place to start, but it is absolutely not something to rely on.
>LLMs are not doing searches, they are doing statistical computation of likely results.
This was true of ChatGPT in 2022, but any modern platform that advertises a "deep research" feature provides its LLMs with tools to actually do a web search, pull the results it finds into context and cite them in the generated text.
> "grokipedia" as idea
So you can understand someone not liking something, but you cannot understand that person liking the idea of an alternative? What is the idea for you if not just an alternative to the established service with the undesired part changed?
Because not liking something does not imply liking any possible alternative.
Which one is the "undesirable part changed" here? Wikipedia is written by humans, it has a not-for-profit governance model, it encompasses a large, international community of authors/editors that attempt to operate democratically, it has an investment/commitment in being an openly available and public source of information. Grokipedia, on the other hand, is AI-generated, and operated by a for-profit AI company. Even if "grokipedia" managed somehow to get traction and "overthrow" wikipedia, there is no reason on earth why a company would operate it for free and not try to make profit out of it, or use it for their ends in ways much more direct than what may or may not be happening to wikipedia. Having a billionaire basically control something that may be considered "ground truth" of information seems a bad idea, and having AI generate that an even worse one.
I can understand somebody not liking something in how wikipedia is governed or operating, after all whatever has to do with getting humans work together in such a scale is bound to be challenging. I can understand somebody ideologically disagreeing with some of the stances that such a project has to take eventually (even if one tries to be neutral as much as possible, it is inevitable to avoid some clash somewhere about where this neutrality exactly lies). But grokipedia much more than "wikipedia but different ideologically".
edit: just to be clear, I see a critique of the "idea of grokipedia" as eg the critique of it being a billionaire controlled, AI generated project to substitute wikipedia; a critique of the implementation would be finding flaws to actual articles in grokipedia (overall). I think the idea of it is already flawed enough.
They meant the idea of Wikipedia rewritten by Grok (or another controversial LLM) specifically, not just any alternative.
Not all alternatives are necessarily worthy. I can understand someone not liking tomatoes. I can't understand someone liking depleted uranium.
Maybe ask a Ukrainian soldier which they prefer (modern armor is often made of depleted uranium). Environment shapes such preferences far more than personality.
what do you have against depleted uranium? you know what they say, one man’s trash is another man’s treasure :)