Comment by bananaflag

8 days ago

It's a hard problem and so far not a profitable one (I hope the solution will emerge as a byproduct of another innovation).

https://nostalgebraist.tumblr.com/post/778041178124926976/hy...

https://nostalgebraist.tumblr.com/post/792464928029163520/th...

I think you made the right diagnosis with "cringe" :) They forgot to turn down the cringe slider!

Have you played with the pre-RLHF models? I think Davinci is still online, though probably not for much longer.

They're a lot harder to work with (they have no instruct training, so they just continue the text you give them rather than obeying commands). But they seem almost immune to the problem of mode collapse. They'll happily generate horrifying outputs for you. They're unsanitized. What cringe is in there is authentic! Raw cringe, straight from Common Crawl.
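Since a base model only continues text, the usual trick is to frame what you want as the opening of a document and let the model finish the pattern. A minimal sketch: the few-shot framing below is generic, while the `davinci-002` model name and the legacy Completions endpoint are assumptions about what OpenAI still serves.

```python
import os


def few_shot_frame(examples, query):
    """Frame a question as a pattern for a base model to continue.

    Base models follow patterns rather than commands: show a few
    input/output pairs and stop mid-pattern, so the model's natural
    continuation of the text is your answer.
    """
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)


prompt = few_shot_frame(
    [("What colour is the sky?", "Blue.")],
    "What colour is grass?",
)
print(prompt)

# Only attempt a real call if a key is configured; whether OpenAI
# still serves a base model like davinci-002 is an assumption.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()
    resp = client.completions.create(
        model="davinci-002",  # legacy base model, may be retired
        prompt=prompt,
        max_tokens=50,
    )
    print(resp.choices[0].text)
```

The same framing works with any base model, local or hosted; the point is that the prompt is a document to be continued, not a request to be obeyed.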

It's a lot of fun to play with. It's also very strange, because there ought to be much more interest in them, for several reasons (they're the most language-modely of the language models, and ideal for research and experiments, to say nothing of censorship studies, exploring alternative approaches to LLM development, etc.), and yet it seems like nobody is talking about them or doing anything with them.

  • That's not my blog, but the author played a lot with GPT-2 and GPT-3 back in the day. (And I regret not doing the same.)

    I think you can still play with some uncensored base models of that level even today.