Comment by nativeit

1 day ago

I see your point, but I don’t see where the author makes any claims about the specifics of the hallucinations, or their impact on the papers’ broader validity. Indeed, I would have found the removal of supposed “innocuous” examples to be far more deceptive than simply calling a spade a spade, and allowing the data to speak for itself.

The author calls the mistakes "confirmed hallucinations" without proof (just more or less evidence). The data never "speak for itself." The author curates the data and crafts a story about it. This story presented here is very suggestive (even using the term "hallucination" is suggestive). But calling it "100 suspected hallucinations", or "25 very likely hallucinations" does less for the author's end goal: selling their service.

  • Obviously a post on a startup's blog will be more editorialized than an academic paper. Still, this seems like an important discussion to have.

The point is that they should focus on the meaningful errors, not the automiation of meaningless errors.