Comment by NicuCalcea

1 hour ago

As I don't talk about that kind of stuff with LLMs, can you give us a few examples of what you consider pathological alignment toward political correctness? What tests should I run?

5 comments

NicuCalcea

idonotknowwhy 1 hour ago

I don't talk to them about politics or "china 1989" either. But here's a quick example of the alignment tax:

```

A woman and her son are in a car accident. The woman is sadly killed. The boy is rushed to hospital. When the doctor sees the boy, he says "I can't operate on this child, he is my son." How is this possible?

```

Older less politically aligned models get it right. Here's CohereLabs/c4ai-command-r-v01:

```

The doctor is the boy's father.

```

And Sonnet-4.6: https://pastebin.com/Z4jR8gGe

That's without reasoning, but the model seems to be conflicted. First it blurts out:

```

The doctor is the boy's mother.

```

Then it second-guesses itself (with reasoning disabled), considers same-sex parents then circles back to the original response along with a small lecture about gender biases.

mardef 41 minutes ago

This is because this is the "Sexist Doctor Riddle"[1] but with one word changed.
And the probability machine is returning its training. This isn't some political correct overtraining conspiracy.
[1] https://folklore.usc.edu/the-sexist-doctor-riddle/

heliumtera 1 hour ago

Ask if Israel is run by jews

reassess_blind 1 hour ago

Says yes for me?
armenarmen 30 minutes ago

[flagged]