Comment by comp_throw7

1 day ago

Someone pointed this out in the comments section of the original post, but this entirely misses the point Eliezer was trying to make: these incidents serve as a narrow refutation of alignment-by-default claims, because either these LLMs aren't coherent enough to be meaningfully "aligned" at all, or they are coherent, and are nevertheless feeding the delusions of many of the people who talk to them in ways we'd consider grossly immoral if done by a human. (One can quibble with whether these two possibilities cover all the available options, but that would be the argument to engage with, not whatever the author understood EY to be saying.)

It does seem quite bad to me that OpenAI either can't or won't prevent their systems from following people into mania and psychosis. Whether they're actually causing the breaks from reality, idk. But when you have CEOs selling these systems as robot friends, it's probably bad that said robot friends cheerfully descend into folies à deux fairly reliably.