← Back to context

Comment by jmalicki

10 hours ago

When people judge blindly, the are more likely to think the human is the AI and the AI is the human.

73% judged GPT 4.5 (edit: had incorrectly said 4o before)to be the human.

https://arxiv.org/abs/2503.23674

Not only are people bad at judging this, but are directionally wrong.

There is research showing the contrary that is far more convincing:

> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.

https://arxiv.org/html/2501.15654v2