Comment by hamdingers

13 days ago

I wonder to what extent 4/4o is the culprit, vs it simply being the default model when many of these people were forming their "relationships."

4o had notable problems with sycophancy: it was extremely positive about the user and went along with almost anything the user said. OpenAI even wrote about it [0], and the newer models' responses to people trying to continue their former 'relationships' do tend toward being 'harsh' [1], especially for someone who actually thought of the bot as a kind of person.

[0] https://openai.com/index/sycophancy-in-gpt-4o/

[1] https://www.reddit.com/r/MyBoyfriendIsAI/comments/1qx3jux/wh...

  • It really does give a lot of signal[1] to people in the dating scene: validate and enthusiastically respond to potential romantic partners and the world is your oyster.

    1. possibly/probably not in a good or healthy way? idk

    • From the viewpoint of self psychology, people are limited in their ability to seduce because they have a self. You can't maintain perfect mirroring: you get tired, their turn-on is your squick, etc. In the early stage of peak ensorcellment (limerence), people don't see the "small signals" -- they miss the microexpressions, sarcastic leaks, etc. -- they see what they want to see. But eventually that wears off.

      It can be puzzling that people fall for "romance scams" run by people whose voice they haven't even heard, but it's actually a safer space for that kind of seducer to operate, because the low-fi channel avoids all sorts of information leaks.

Anecdotally, 4o's sycophancy was higher than any other model I've used. It was aggressively "chat-tuned" to say what it thought the user wanted to hear. The latest crop of frontier models from OpenAI and others seems to have significantly improved on this front — does anybody know of a sycophancy benchmark attempting to quantify this?

  • If I worked at OpenAI, I would dial up the sycophancy to lock my users in right before raising subscription prices.

    • That's... a strategy. It's only a matter of time before an AI companion company succeeds with this by finetuning one of the open-source offerings. Cynically, I'm sure there are at least a few VC-backed startups already trying it.

      1 reply →

It's not that complicated. 4o was RLHF'd to be sycophantic as hell, which was fine until someone had a psychotic episode fueled by it, so they changed course with the next model.

  • Not just someone, many, many people, going by the feedback on Reddit. People are mourning the damn thing.

    Grossly irresponsible to ever release this IMO.

Never used 4o in an unhealthy way, but the audio mode was so much fun (especially for cooking help). I've essentially quit using AI audio since. Nothing compares.

I think that's part of it, but the user then perceives "personality changes" when a new model rolls out, simply because the models differ. Now they feel they have lost their relationship because of the model change.