Comment by photonthug

8 days ago

Doesn’t it seem like LLMs can assist with moderation rather than making it harder?

I’m not sure exactly why we are still waiting for the obviously possible ad-hominem and sunk-cost-fallacy detectors, etc. For the first time we now have the ability to actually build a threaded comment system that (tries to) insist on rational and on-topic discussion. Maybe part of that is that we haven’t actually made the leap yet to wanting to censor non-contributing but still-human “contributors” in addition to spam. I guess shitposting is still part of the “real” attention economy and important for engagement.

The apparently on-topic but subtly wrong stuff is certainly annoying, and in the case of vaguely relevant, not obviously commercial misinformation or misapprehension, I’m not sure how to tell humans from bots. But otoh you wouldn’t actually need that level of sophistication to clean up the cesspool of most YouTube or Twitter threads.

That would presume that the moderation knows the truth, that a single truth even exists and that the moderation itself is unbiased.

It would also presume that an LLM knows the truth, which it does not. Even in technical and mathematical matters it fails.

I do not think an LLM can even accurately detect ad-hominem arguments. Is "you launched a scam coin scheme in the first days of your presidency and therefore I don't trust you on other issues" an ad-hominem or an application of probability theory?

  • Suppose you’re right; an LLM could still label that as hostile or confrontational, implying that we at least now have the ability to try to filter threads on a simple axis like “arguing” vs “information” vs “anecdote”, and in other dimensions much more sophisticated than classic sentiment analysis (a rough sketch of that kind of classification call follows this comment).

    We might struggle to differentiate information vs disinformation, sure, but the above-mentioned new superpowers are still kind of remarkable, and easily accessible. And yet that “information only please” button is still missing, and we are smashing simple up/down votes like cavemen.

    Actually, when you think about even classic sentiment analysis capabilities, it really shows how monstrous and insidious algorithmic feeds are… most platforms just don’t want to surrender any control to users at all, even when we have the technology.
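To make that “arguing” vs “information” vs “anecdote” filter concrete, here is a minimal sketch of the classification call, assuming the OpenAI Python SDK’s chat-completions endpoint; the label set, prompt wording, and model name are illustrative choices of mine, not anything specified in the thread.

```python
# Minimal sketch: label a comment along one simple axis with an LLM.
# Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set.
from openai import OpenAI

LABELS = ["arguing", "information", "anecdote"]

client = OpenAI()

def classify_comment(text: str) -> str:
    """Ask the model to pick exactly one label for a comment."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model would do
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Classify the user's comment as exactly one of: "
                        + ", ".join(LABELS) + ". Reply with the label only."},
            {"role": "user", "content": text},
        ],
    )
    label = resp.choices[0].message.content.strip().lower()
    return label if label in LABELS else "unclear"

# An "information only please" button would then just hide everything
# that doesn't come back labeled "information".
```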

> "Doesn’t it seem like LLMs can assist with moderation rather than making it harder?"

The moderators will need to pay for LLM service to solve a problem created by malicious actors who are also paying for LLM service? No wonder the LLM providers have sky-high valuations.

  • Compute providers are gonna get paid, yeah. We can hope, though, that there’s something asymmetric in the required expense for good guys vs bad guys. For example, “subtly embed an ad for X while you pretend to reply to Y” does seem like a harder problem that you need a cloud model for. TFA mentioned crypto blog spam, which could easily be detected with keywords and a local LLM, or no LLM at all (a toy keyword pre-filter of that kind is sketched below).

  • There’s already a lightweight LLM tool for moderation that doesn’t take much to run.
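For the crypto-spam case mentioned just above, the keyword route really is cheap. The sketch below is a toy illustration: the term list and threshold are invented for the example, and in practice the ambiguous leftovers would go to a small local model rather than keywords alone.

```python
# Toy keyword pre-filter for obvious crypto blog spam; no LLM involved.
# The term list and threshold are made-up examples, not a vetted blocklist.
import re

CRYPTO_SPAM_TERMS = [
    "airdrop", "presale", "guaranteed returns",
    "recover your funds", "dm me", "telegram group",
]

def looks_like_crypto_spam(comment: str, min_hits: int = 2) -> bool:
    """Flag a comment when enough spammy phrases appear (case-insensitive)."""
    text = comment.lower()
    hits = sum(
        1 for term in CRYPTO_SPAM_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )
    return hits >= min_hits

# looks_like_crypto_spam("DM me on our telegram group for the airdrop!")  # True
```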

Hey, this is part of my thesis and what I’m working towards figuring out.

People are already working on LLMs to assist with content moderation (COPE). Their model can apply a given policy (e.g. a harassment policy) to a piece of content and judge whether it matches the criteria, roughly the pattern sketched below. So the tooling will be made, one way or another.
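In rough outline, that “apply a given policy to a piece of content” step is a prompt template plus a yes/no judgment. The sketch below is only a schematic of the pattern, not COPE’s actual interface; the prompt wording, the VIOLATES/OK output convention, and the judge callable are all assumptions for illustration.

```python
# Schematic of policy-based moderation: hand a written policy plus the content
# to some LLM and read back a violate / no-violate verdict.
from typing import Callable

def build_policy_prompt(policy: str, content: str) -> str:
    """Combine a written policy and a comment into one judgment prompt."""
    return (
        "Policy:\n" + policy.strip() + "\n\n"
        "Content:\n" + content.strip() + "\n\n"
        "Does the content violate the policy? Answer VIOLATES or OK, "
        "then give one sentence of reasoning."
    )

def violates_policy(policy: str, content: str,
                    judge: Callable[[str], str]) -> bool:
    """True if the model's answer starts with VIOLATES.

    `judge` is any function that sends a prompt to a model (hosted API
    or local) and returns its text reply.
    """
    verdict = judge(build_policy_prompt(policy, content))
    return verdict.strip().upper().startswith("VIOLATES")
```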

My support for the thesis is also driven by how dark the prognostications are.

Soon we won’t be able to distinguish humans from bots, or even facts from fabrication. The only things which will remain relatively stable are human wants / values and rules / norms.

Bots that encourage pro-social behavior, norms, and the like are definitely needed; they are exactly the natural survival tools we will need.