Comment by Myrmornis

6 months ago

For some subjects, it's appropriate to host multiple versions of articles written natively in different languages.

But for other subjects, for example science and mathematics, this approach does a huge disservice to non-English readers: it means that their Wikipedia is second-rate, or worse.

Wikipedia should, in science, mathematics, and other subjects that do not have cultural inflection, use machine translation so that all articles in all languages are translations of the same underlying semantic content.

It would still be written by humans. But ML / LLMs would be involved in the editing pipeline so that people lacking a common language can edit the same text.

This is the biggest mistake Wikipedia's made IMO: it privileges English readers since the English content is highest quality in most areas that are not culturally specific, and I do not think that it's an organization that wants to privilege English readers.

Users can already translate English Wikipedia articles to other languages on the fly with Chrome etc. However, the quality of the translation is just not up to scratch yet, particularly for languages that are radically different from English; just try reading some ML-translated Japanese or Chinese Wikipedia articles.

Science and Mathematics have no cultural inflection? Do you speak more than one language? Each language has its standard sentence structures when it comes to these disciplines, and auto translators are very much not up to the task.

I prefer my Wikipedia to remain 100% human-generated quality information over garbage AI slop content, which is already abundant enough on the internet.

  • I understand you'd like to believe that. But it looks like you're simply wrong. For example, here is the translation produced by Google Chrome of the Chinese version of the page on Bézout's theorem.

    https://zh.wikipedia.org/wiki/%E8%B2%9D%E7%A5%96%E5%AE%9A%E7...

    It reads "Bézout's theorem is a theorem in algebraic geometry that describes the number of intersections of two algebraic curves. The theorem states that the number of intersections of two coprime curves X and Y is equal to the product of their degrees."

    which is perfectly good English. The problem is that that is the entire page! It is thus woefully inadequate in comparison to the English page:

    https://en.wikipedia.org/wiki/B%C3%A9zout%27s_theorem

> it means that their Wikipedia is second-rate, or worse.

?

  • You shouldn't respond lazily like that.

    The point is that the quality of a Wikipedia page is positively correlated with the number of editors working on it.

    • I mean, even according to ChatGPT[1]...

        "This comment is well-meaning, but it is both naive and technically flawed in several key ways. Let’s unpack why it's wrong and even counterproductive, especially when it comes to topics like science and mathematics." ...snip snip... "TL;DR: The comment is naive because it overestimates the capabilities of machine translation for precise scientific knowledge, underestimates cultural context in science/math, and proposes a solution that would undermine Wikipedia’s decentralized, community-driven model. It wrongly frames linguistic diversity as a weakness instead of a strength."
      

      1: https://gist.github.com/numpad0/2fcf3e61d57f07d8e3a65743a43b...
