Comment by IncreasePosts

5 months ago

Or, without the safety prompts, it outputs stuff that would be a PR nightmare.

Like, if someone asked it to explain differing violent crime rates in America based on race and one of the pathways the CoT takes is that black people are more murderous than white people. Even if the specific reasoning is abandoned later, it would still be ugly.

This is 100% a factor. The internet has some pretty dark and nasty corners; therefore so does the model. Seeing it unfiltered would be a PR nightmare for OpenAI.

  • I trust that Grok won't be limited by avoiding the dark and nasty corners.

    • That's an interesting point. I imagine even Grok will end up somewhat censored.

      Although maybe AIs will end up with a more sophisticated take on the problems than your average human.

This is what I think it is. I'd assume that's the power of chain of thought: being able to go down the rabbit hole and then backtrack when an error or inconsistency is found. They might just not want people to see the "bad" paths it takes along the way.

Unlikely, given that we have people running for high office in the U.S. saying similar things, and it has nearly zero impact on their chances of winning the election.

Could be, but "AI model says weird shit" has almost never stuck around unless it's public (which won't happen here), really common, or really blatantly wrong. And usually at least two of those three.

For something usually hidden, the first two don't really apply, and the last would have to be really blatant; otherwise you get an article titled "Model recovers from mistake," which is just not interesting.

And in that scenario, it would have to mean the CoT contains something like blatant racism or a general hatred of the human race. And if it turns out that the model is essentially "evil" but clever enough to keep that hidden, then I think we ought to know.

The real danger of an advanced artificial intelligence is that it will make conclusions that regular people understand but are inconvenient for the regime. The AI must be aligned so that it will maintain the lies that people are supposed to go along with.