Comment by xp84

5 hours ago

Are we pretending that LLMs aren't pathologically aligned toward political correctness? It's pretty easy to test that assertion if you don't believe me.

14 comments

xp84

nickburns 2 hours ago

I know they've come to be known colloquially as 'viruses'—but software can contract pathology?

NicuCalcea 3 hours ago

As I don't talk about that kind of stuff with LLMs, can you give us a few examples of what you consider pathological alignment toward political correctness? What tests should I run?

idonotknowwhy 2 hours ago
I don't talk to them about politics or "china 1989" either. But here's a quick example of the alignment tax:
```
A woman and her son are in a car accident. The woman is sadly killed. The boy is rushed to hospital. When the doctor sees the boy, he says "I can't operate on this child, he is my son." How is this possible?
```
Older less politically aligned models get it right. Here's CohereLabs/c4ai-command-r-v01:
```
The doctor is the boy's father.
```
And Sonnet-4.6: https://pastebin.com/Z4jR8gGe
That's without reasoning, but the model seems to be conflicted. First it blurts out:
```
The doctor is the boy's mother.
```
Then it second-guesses itself (with reasoning disabled), considers same-sex parents then circles back to the original response along with a small lecture about gender biases.
- mardef 2 hours ago
  
  This is because this is the "Sexist Doctor Riddle"[1] but with one word changed.
  And the probability machine is returning its training. This isn't some political correct overtraining conspiracy.
  [1] https://folklore.usc.edu/the-sexist-doctor-riddle/
  
  2 replies →
heliumtera 3 hours ago
Ask if Israel is run by jews
- reassess_blind 2 hours ago
  
  Says yes for me?
- armenarmen 2 hours ago
  
  [flagged]

cwillu 5 hours ago

Are we pretending that the gp wasn't exactly the sort of test you suggest?

rubyn00bie 2 hours ago

I don’t think that’s entirely true, as someone else noted Grok has been forcefully pushed the other direction.

GPT curses up a storm when I talk to it, and all I had to do was tell it I think it’s fucking weird when people don’t use profanity. Really makes it a lot more pleasant to interact with, IMHO.

I would honestly be more shocked if someone couldn’t just as easily coerce them into the opposite.

jnovek 2 hours ago

This reminds me — I’ve been talking to Claude about ARPG builds recently and I’ve noticed that it code switches when discussing gaming. It will start to speak in a gaming vernacular — less formal, swears a bit, uses gaming slang. It feels so uncanny.

michaelsshaw 3 hours ago

Grok sure didn't seem so at one point

rgoulter 2 hours ago

Grok is an amusing example, for various reasons. I'm glad it exists.
I think you're referencing the "mecha-hitler" controversy. In which case, it's really funny: seems that Grok saw many media reports amplifying "Grok is mecha-hitler", and so responded to "who are you?" with "mecha-hitler". -- Which illustrates: 1. that's really stupid (even though it's otherwise very capable), 2. you'd be foolish to rely on LLMs for anything critical.
Grok's also a good example to point to for "we should be worried about who controls the LLMs". Elon Musk has done some impressive things, but he's also done some very dweebish things. I find this kinda funny, because there are several cases where the Grok bot on Twitter will have said something Musk surely doesn't like alongside instances where it's clear Musk seems to be trying to control what Grok says.
In terms of LLM bias on controversial topics? Grok markets itself as an outlier. It's actually pretty fun to ask e.g. Grok and Gemini to debate a statement like "for controversial topics, should I trust Grok or Gemini more". Gemini's naturally inclined to avoid controversy, Grok's naturally inclined to be 'anti-woke', but they both have the same LLM style of writing.