Comment by pbhjpbhj
12 hours ago
Where is ChatGPT picking up the supportive pre-suicide comments from? It feels like that genre of comment has to be copied from somewhere. They're long and almost eloquent. They can't be emergent generation, surely? Is there a place on the web where these sorts of 'supportive' comments are given to people who have chosen suicide?
>They can't be emergent generation, surely
It is emergent. It's what you get when you RLHF for catchy, agreeable, enthusiastic responses. The content doesn't matter; it's the "style" that gets applied like a coat of paint over anything. That's how you end up with the corpspeak-esque yet chilling sentences mentioned in https://news.ycombinator.com/item?id=45845871
What would be nice is for OpenAI to do a retrospective here and perform some interpretability research. Does the LLM even "realize" (in the sense of the residual stream encoding those concepts) that it is encouraging suicide? I'd almost hypothesize that the process of RLHF'ing and selecting for sycophancy diminishes those circuits, effectively lobotomizing the LLM (much like safety training does) so it responds only to the shallow immediate context, missing the forest for the trees.
Absolutely. These places have long existed. Hence the risk of a dragnet of training data producing consequences exactly like this. This is no accident.
Before Reddit banned a ton of subreddits for lacking moderation, I believe r/assistedsuicide was the place for discussions like this.