Comment by rayiner

10 hours ago

> Oh my god did we inadvertently train AIs on idiotspeak.

There was nothing inadvertent about it. A decade of cultivating and harvesting millions of examples of this kind of pseudo-writing from underpaid internet piece-workers preceded LLMs.

Given that this specific style is the result of being reinforced over and over again via RLHF, "inadvertently" isn't really the word I'd use.

> did we inadvertently train AIs on idiotspeak.

Nope! That is, training on lowest-common-denominator, low-signal, high-noise "idiotspeak" was not at all inadvertent.

* Checks notes *

Reddit

Twitter

Facebook

4chan

Call of Duty chat logs

Every public marketing site

Slashdot

Usenet

...

Verdict: Yes, idiotspeak was part of the training set, but no, it was not inadvertent. There's a smattering of Shakespeare in there, at least.