Comment by catigula

14 hours ago

Popular LLMs have a weird confessional style of "owning up" to "mistakes". Firstly, you can make them apologize for mistakes they didn't even commit or ones that don't even exist. Secondly, if you really corner one on an actual mistake, it'll start apologizing in an obsequious way that seems to imply that it's "playing into" the human's desire to flagellate it for wrongdoing. It's a little masochistic in the real sense and very odd.

The whole people pleaser routine is very creepy in my book and makes them say very weird things. See an example below.

https://futurism.com/anthropic-claude-small-business

> When Anthropic employees reminded Claudius that it was an AI and couldn't physically do anything of the sort, it freaked out and tried to call security — but upon realizing it was April Fool's Day, it tried to back out of the debacle by saying it was all a joke.

The bit I don't understand is why make an AI apologise or fess up to mistakes at all. It has no emotions and can't feel bad about what it did.

I don't think this is "apologizing mode", rather "funny post-mortem blog post" mode. I found it ironic when the company claimed it will "perform a post mortem to determine exactly what happened" when what happened was probably caused by munching up dozens of these.

I’ve found I have to avoid “leading” AI, or it will take my lead too seriously when I’m just asking or unsure.

Yup.

Seems AI has now gone from

"Overenthusiastic intern who doesn't check its work well so you need to"

straight to:

"Raging sociopathic intern who wants to watch the world burn, and your world in particular."

Yikes! The fun never ends