Comment by floxy
19 hours ago
Well, I'd think that YouTube would have an incentive to get it right. Either there are too many false positives, the content creators go away, and YouTube collapses, or there are too many false negatives, the viewers go away, and YouTube collapses. I mean, there is a chance that garbage people will ruin video sharing platforms for everyone.
Having the incentive to do something and having the ability to do it are not the same thing.
It's not like human-generated content is made of carbon and AI-generated content is made of silicon and the science of chemistry can unambiguously tell them apart. If you asked a million humans and a million LLMs to write a sentence on a specific subject, it's not implausible that one of the LLMs and one of the humans would output the exact same sentence. Maybe more than one.
A detector that sees only the output and always tells you correctly whether it was AI-generated is therefore impossible: if it said no, it would be wrong when the LLM generates that sentence, and if it said yes, it would be wrong when a human generates the exact same sentence.
All it can do is try to calculate a probability. But then what do you want to do with that? Suppose the probability it estimates for some content is 45%, and that probability estimate is an accurate measure of the true probability, i.e. can't be improved when the only information you have is the content itself. Do you want to ban the 55% of that content which is human-generated, or allow the 45% which is AI-generated?
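To make that trade-off concrete, here is a minimal sketch of the arithmetic. It assumes a batch of items that all received the same, perfectly calibrated score; the function name and the batch size of 1000 are made up for illustration.

```python
def outcomes_at_score(n_items: int, p_ai: float) -> dict:
    """Expected mistakes for n_items that all share one calibrated score p_ai.

    Assumes calibration is exact, so a fraction p_ai of the items
    really is AI-generated and the rest is human-made.
    """
    ai_items = round(n_items * p_ai)
    human_items = n_items - ai_items
    return {
        # Policy 1: ban everything with this score.
        "ban_all": {"humans_wrongly_banned": human_items, "ai_let_through": 0},
        # Policy 2: allow everything with this score.
        "allow_all": {"humans_wrongly_banned": 0, "ai_let_through": ai_items},
    }


# Hypothetical batch: 1000 uploads that all scored 45%.
result = outcomes_at_score(1000, 0.45)
print(result["ban_all"])    # {'humans_wrongly_banned': 550, 'ai_let_through': 0}
print(result["allow_all"])  # {'humans_wrongly_banned': 0, 'ai_let_through': 450}
```

No threshold avoids both columns: since the items are indistinguishable from the content alone, any rule has to treat all 1000 the same way.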
Right now the problem is the flood of low-quality AI spam, which might (or might not) be low-hanging fruit. We can worry about high-quality AI artifacts later if that becomes a problem. (And yes, there is no guarantee that YouTube won't fail due to these spammers.)
But is an algorithmic AI detector really a thing?
I get the idea: get 10k samples each of human data and AI data, train a simple classifier until it gets 99.9999% accuracy or <10k false negatives per day at your scale, and ship it as a screening tool.
Is such a tool feasible at all with the current state of AI technology, or is it just a reasonable take from the past that may not be so reasonable anymore?
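For a rough sense of those two targets, here is a back-of-the-envelope sketch. The daily volume of AI uploads is a made-up placeholder, not a real YouTube figure, and the "rule of three" is a standard approximation for the 95% upper bound on an error rate when zero errors were seen in n independent test samples.

```python
def max_miss_rate(ai_uploads_per_day: int, allowed_misses_per_day: int) -> float:
    """Largest false-negative rate that stays within the daily miss budget."""
    return allowed_misses_per_day / ai_uploads_per_day


def rule_of_three_upper_bound(n_clean_samples: int) -> float:
    """Approximate 95% upper bound on the error rate after n samples with zero errors."""
    return 3 / n_clean_samples


# Hypothetical scale: one million AI-generated uploads per day.
budget = max_miss_rate(ai_uploads_per_day=1_000_000, allowed_misses_per_day=10_000)

# 10k human + 10k AI samples, even with a perfect score on all of them.
bound = rule_of_three_upper_bound(20_000)

# Error rate implied by 99.9999% accuracy.
target_error = 1e-6

print(f"miss rate allowed by the <10k/day budget: {budget:.4%}")
print(f"best error rate 20k clean samples can support: {bound:.4%}")
print(f"error rate implied by 99.9999% accuracy: {target_error:.4%}")
```

Under these assumptions the two targets are far apart: the per-day budget tolerates a miss rate around 1%, while 99.9999% accuracy is a one-in-a-million error rate that a 20k-sample test set cannot demonstrate even if the classifier gets every sample right.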