Comment by patapong

3 months ago

I distinctly remember someone running an experiment: asking ChatGPT to write jokes (?) about different groups and measuring how often it refused, to produce a ranking. I think it was a Medium article, but now I cannot find it anymore. Does anyone have a link?

EDIT: At least here is a paper aiming to predict ChatGPT prompt refusal https://arxiv.org/pdf/2306.03423 with an associated dataset https://github.com/maxwellreuter/chatgpt-refusals

EDIT2: Aha, found it! https://davidrozado.substack.com/p/openaicms There is an interesting graph about 3/4 of the way down the page showing what ChatGPT's moderation considers hateful.

That is a crazy read, thanks for the added links. I wonder if these effects all come from it being trained on the internet, and the internet generally being outspoken on the left side?

Thanks for this. As someone who is from neither the US nor China, I am getting so tired of this narrative of how bad DeepSeek is because it censors X or Y. The reality is that all internet services censor something; it is just a matter of choosing which service is more useful for the task, given the censorship.

As someone from a third world country (in the original meaning of the term), I couldn't care less about US or Chinese political censorship in any model or service.