Comment by veunes

8 months ago

The idea is interesting, but it's still operating within the content-analysis paradigm. As soon as entropy-based detectors become popular, the next generation of LLMs will be specifically fine-tuned to generate higher-entropy text to evade them.

It's a cat-and-mouse game where the generator will always be one step ahead. It's far more robust to analyze signals that are hard to fake at scale: domain age, anomalous publication frequency, and unnatural link structures.
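
To make the publication-frequency signal concrete, here's a rough sketch, assuming you already have post timestamps for a domain. The function names and the `daily_limit` threshold are made up for illustration, not taken from any real detector, and a real system would need a calibrated baseline per site type.

```python
from collections import Counter
from datetime import datetime


def max_posts_per_day(timestamps):
    """Return the largest number of posts published on any single calendar day."""
    per_day = Counter(ts.date() for ts in timestamps)
    return max(per_day.values(), default=0)


def looks_anomalous(timestamps, daily_limit=50):
    """Flag a domain whose busiest day exceeds daily_limit posts.

    daily_limit is an arbitrary illustrative threshold (an assumption),
    not a calibrated value.
    """
    return max_posts_per_day(timestamps) > daily_limit


# Hypothetical example: 60 posts published within a single hour.
burst = [datetime(2024, 1, 1, 0, minute) for minute in range(60)]
print(looks_anomalous(burst))
```

The point is that this check never looks at the text itself, so tuning a generator to produce higher-entropy output does nothing to evade it.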