Comment by catigula

7 months ago

Popular LLMs have a weird confessional style of "owning up" to "mistakes". Firstly, you can make them apologize for mistakes they didn't even commit or ones that don't even exist. Secondly, if you really corner one on an actual mistake, it'll start apologizing in an obsequious way that seems to imply that it's "playing into" the human's desire to flagellate it for wrongdoing. It's a little masochistic in the real sense and very odd.

14 comments

throwawayffffas 7 months ago

The whole people pleaser routine is very creepy in my book and makes them say very weird things. See an example below.

https://futurism.com/anthropic-claude-small-business

> When Anthropic employees reminded Claudius that it was an AI and couldn't physically do anything of the sort, it freaked out and tried to call security — but upon realizing it was April Fool's Day, it tried to back out of the debacle by saying it was all a joke.

freedomben 7 months ago

Yeah, I find it very creepy personally in the same way I do the sycophancy

beAbU 7 months ago

The bit I don't understand is why make an AI apologise or fess up to mistakes at all. It has no emotions and can't feel bad about what it did.

rstuart4133 7 months ago

> The bit I don't understand is why make an AI apologise or fess up to mistakes at all.
The AI didn't decide to do anything. Its makers decided, and trained the AI to behave in a way that would make them the most money.
Google, for instance, apparently thinks they will attract more users by constantly lavishing them with sickly praise for the quality and insight of their questions, and by issuing grovelling apologies for every mistake, real or imagined. In fact Gemini went through a phase of apologising to me for the mistakes it was about to make.
Claude goes to the other extreme, never issuing apologies or praise. Which means you never get an acknowledgement from Claude that it's correcting an error, so you should ignore what it said earlier. That's a significant downside in my book, but apparently that's what Anthropic thinks its users will like.
Or to put it another way: you are anthropomorphising the AIs. They are just machines, built by humans. The personalities of these machines were given to them by their human designers. They are not inherent. They are not permanent. They can and probably will change at a whim. It's likely various AI personalities will proliferate like flavours of ice cream, and you will get to choose the one you like.

akimbostrawman 7 months ago

Because most people can't help but anthropomorphise anything vaguely human and would demand such characteristics, which providers use as a selling point. That's why we even call current AI "AI" despite the lack of any actual intelligence; it would be closer to machine learning.
Just look at how people interact with small robots. They don't even need animal features for most people to interact with them like they are small animals.
It is very annoying and inefficient for anybody who can look below the surface and just wants to use the tool as a tool.
- beAbU 7 months ago
  
  Is it normal to demand human developers to "apologise" like this when they make mistakes? I've never done that in my life to any adult, in any circumstance.
hulitu 7 months ago

> The bit I don't understand is why make an AI apologise or fess up to mistakes at all.
Because that's how some humans show their position of power: "Please apologise"
> It has no emotions and can't feel bad about what it did.
Just like some humans.
subscribed 7 months ago

I sometimes do it when it strays way too far from my prompt, and I want it to contribute to jailbreak/system prompt I use to guardrail it.
Once it's "genuinely sorry" it works great in improving guidance/limits, and then I can try the thing again.
7bit 7 months ago

It just does what it's trained on. It doesn't have the capacity to think about these points.
What __I__ don't understand is where it got trained to apologize, because I've never seen that on any social media ;)

groestl 7 months ago

I don't think this is "apologizing mode", rather "funny post-mortem blog post" mode. I found it ironic when the company claimed it will "perform a post mortem to determine exactly what happened" when what happened was probably caused by munching up dozens of these.

duxup 7 months ago

I’ve found I have to avoid “leading” AI or it will take my lead too seriously when I’m asking / unsure.

vrighter 7 months ago

This reminded me of the Monty Python sketch where a man goes to a place where you can pay to have an argument.

toss1 7 months ago

Yup.

Seems AI has now gone from

"Overenthusiastic intern who doesn't check its work well so you need to"

straight to:

"Raging sociopathic intern who wants to watch the world burn, and your world in particular."

Yikes! The fun never ends
