Comment by contravariant

5 months ago

Could be, but 'AI model says weird shit' has almost never stuck around unless it's public (which won't happen here), really common, or really blatantly wrong. And usually at least 2 of those three.

For something usually hidden, the first two don't really apply that well, and the last would have to be really blatant, unless you want an article about "Model recovers from mistake", which is just not interesting.

And in that scenario, it would have to mean the CoT contains something like blatant racism or a general hatred of the human race. And if it turns out that the model is essentially 'evil' but clever enough to keep that hidden, then I think we ought to know.

Just no. AI being racist is still a popular meme: "Because the programmers are white males blah blah".

  • Why can't it be, if it were (I'm not saying that it is, mind) trained on racist material?

    • The problem is being kind of right (but not really) for the wrong reasons. Normies think it was told to be a certain way. While that's kind of true, they think of it more like Eliza.