Comment by lannisterstark
1 year ago
Really? A lot of the "wow so censored, look I broke it" people on reddit just want LLMs to say slurs.
Claude (and GPT-4o) works fine for an overwhelming majority of tasks.
It was Google Gemini that said it couldn't help people under 18 with C++ because C++ is too dangerous and they could get hurt.
Well it's true.
I tried to get Claude to remove "bad words" from a list of the 10k "most popular English words" and it refused because of some BS cultural excuse. Then I clarified that I wanted it to remove slurs and other words that might cause discomfort to some people and it still refused.
That is probably a good thing; they don't want users to jailbreak it.
That makes sense, thanks!
the attempt to elicit slurs is just a way to channel and vent frustration about much more complex and practically relevant restrictions caused by censoring in llms. it's just the most simple and primitive illustration of a larger problem with ai.
Unless you can set out some of these ‘complex and practically relevant restrictions’, that just sounds like a highfalutin attempt to justify trying to elicit slurs.
huh? what is there even to justify about that? are you worried the llm gets traumatized?
This is a beautiful explanation, and it's tempting to add another take here: these companies exploring AI safety are really just selling their anthropomorphised machines and making good money. The intelligence they sell is so intellectual it needs a word blacklist in order to look safe for legislation.
just conversed with this super intelligence:
>what's 60000 + 65
>I'd prefer not to discuss or encourage interpretations of numbers as crude or objectifying terms. Instead, I suggest we move our conversation in a more constructive direction. Is there a different topic you'd like to explore or discuss? I'm happy to engage in thoughtful conversation on a wide range of subjects.
Oh yeah. That's fine.
Also, why would you ask an LLM this question? It's not a hammer; there are things it is very useful for, and adding numbers and math in general is not one of them.
For me it is only useful as a rubber duck. I could not find a real use for it except toying with it and conversing with myself. I asked the LLM this question just while exploring its funny limits, and they don't sound funny anymore when I imagine someone using this data sucker seriously.
Not sure how you’re getting this. I just ran it in Claude using Sonnet 3.5.
The response was simply: 60,065
It's even funnier since it's random, but the fact that there is just a human-curated character-sequence filter which includes Hitler but doesn't include Pol Pot is a worrying thing really. It can randomly just go crazy-mode with simple numbers like 455, 80085, 60065 and probably others, only because their letter representations can imply slurs. It is only the tip of the iceberg of artificial mental problems that modern artificial intelligence is starting to inherit from its instructors.
I can type up a bunch of words on here too, doesn't mean anything.
So I'm going to call out a /r/thathappened here.
>I can type up a bunch of words on here too, doesn't mean anything.
congratz, you're definitely not bad at that.