Comment by NewsaHackO

17 hours ago

>We brought GPT‑4o back after hearing clear feedback from a subset of Plus and Pro users, who told us they needed more time to transition key use cases, like creative ideation, and that they preferred GPT‑4o’s conversational style and warmth.

This does support the idea that OpenAI doesn't make models sycophantic as a deliberate attempt to butter up users so that they use the product more; it's because people actually want AI to talk to them like that. To me, that's insane, but they have to play the market, I guess.

As someone who's worked with population data, I've found there's an enormous rift between reported opinion (including HN and Reddit opinion) and revealed population preferences (as measured through experimentation).

  • I've always thought that the idea that "revealed preferences" are true preferences discounts the fact that people often make decisions they would rather not make. It's like the whole idea that if you're on a diet, it's easier not to have junk food in the house in the first place than to have junk food around and not eat more than your target amount. Are you saying those people want to put on weight? Or have they just been put in a situation that defeats their impulse control?

    I feel a lot of the "revealed preference" stuff in advertising is similar: advertisers find that if they get past the barriers users put in place, it becomes easy to sell them stuff that, at a higher level, the users do not want.

    • Perfectly put. Revealed preference simply assumes all impulses are correct, which is not the case, and then exploits that.

      Drugs make you feel great: in moderation, perfectly acceptable; constantly, not so much.

  • Well that's what akrasia is. It's not necessarily a contradiction that needs to be reconciled. It's fine to accept that people might want to behave differently than how they are behaving.

    A lot of our industry is still based on the assumption that we should deliver to people what they demonstrate they want, rather than what they say they want.

  • Exactly. That sounds to me like a TikTok vs. NPR/books thing: people tell everyone what they read, then go spend 11 hours watching TikToks until 2am.

  • This is why I work in direct performance advertising. Our work reveals the truth!

    • Your work exploits people's addictive propensity and behaviours, and gives corporations incentives and tools to build on that.

      Insane spin you're putting on it. At best, you're a cog in one of the worst recent evolutions of capitalism.

  • Sounds both true and interesting. Any particularly wild and/or illuminating examples you can share in more detail?

    • The "my boyfriend is AI" subreddit.

      A lot of people are lonely and talk to these things like a significant other. They value roleplay instruction-following that creates "immersion": they tell it to be dark and mysterious and to call itself a pet name. GPT-4o was apparently their favorite because it was very "steerable." Then the news broke that people were doing this, some of them falling off the deep end with it, so OpenAI had to tone down the steerability a bit with 5, and these users seem to say 5's added safeguards break the immersion.

    • My favorite (somewhat off-topic) example of this comes from some qualitative research I was building the software for, a long time ago.

      The difference between the responses and the pictures was illuminating, especially in one study in particular: you'd ask people "how do you store your lunch meat?" and they'd say "in the fridge, in the crisper drawer, in a Ziploc bag", but when you asked them to take a picture of it, the package was just ripped open and tossed in anywhere.

      This apparently horrified the lunch meat people ("But it'll get all crusty and dried out!", to paraphrase), and that study and ones like it are the reason lunch meat now comes in disposable or resealable containers instead of just a tear-to-open packet. Every time I go grocery shopping, it's an interesting experience knowing that that specific thing is, in a small way, a result of some of the work I did a long time ago.

    • Classic example: people say they'd rather pay $12 upfront with no extra fees, but they actually prefer a $10 base price plus $2 in fees. If it didn't work, this pricing model wouldn't be so widespread.

> it's because people actually want AI to talk to them like that

I can't find the particular article (there are a few blogs and papers pointing out the phenomenon; I just can't find the one I enjoyed), but it was along the lines of how, in LMArena, a lot of users tend to pick the "confidently incorrect" model over the "boring-sounding but correct" model.

The average user probably prefers the sycophantic echo chamber of confirmation bias offered by a lot of large language models.

I can't help but draw parallels to the "You are not immune to propaganda" memes. Turns out most of us are not immune to confirmation bias, either.

I thought this was almost entirely due to the AI-personality splinter groups (trying to be charitable) like r/MyBoyfriendIsAI, and to wrapper apps, which vocally let OpenAI know they were using those models the last time they were sunset.

I was one of those pesky users who complained when o3 suddenly was unavailable.

When 5.2 first launched, o3 did a notably better job on a lot of analytical prompts (e.g. "Based on the attached weight log and data from my calorie-tracking app, please calculate my TDEE using at least 3 different methodologies").
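
For anyone unfamiliar with what "multiple methodologies" could mean there, here is a minimal sketch of one of them, the energy-balance method, assuming a toy weight log and an average intake figure from the calorie app. All names, numbers, and the rule-of-thumb 7700 kcal/kg constant below are illustrative assumptions, not the commenter's actual data:

    # Hypothetical energy-balance TDEE estimate; data and constants are
    # illustrative assumptions, not the commenter's actual numbers.
    from datetime import date

    # (date, weight in kg): first and last readings from the weight log
    weight_log = [(date(2024, 1, 1), 82.0), (date(2024, 1, 29), 80.6)]
    avg_intake_kcal = 2100  # average daily intake from the tracking app

    KCAL_PER_KG = 7700  # rule-of-thumb energy content of 1 kg of body mass

    days = (weight_log[-1][0] - weight_log[0][0]).days
    delta_kg = weight_log[-1][1] - weight_log[0][1]

    # Energy balance: what you ate, minus what went into (or came out of)
    # body mass each day. Losing weight means you burned more than you ate.
    tdee = avg_intake_kcal - (delta_kg * KCAL_PER_KG) / days
    print(f"Estimated TDEE: {tdee:.0f} kcal/day")  # ~2485 for this data

The other methodologies would presumably be formula-based estimates (e.g. Mifflin-St Jeor times an activity multiplier) used to cross-check the number.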

o3 frequently used tables to present information, which I liked a lot. 5.2 rarely does this - it prefers to lay out information in paragraphs / blog post style.

I'm not sure if o3 responses were better, or if it was just the format of the reply that I liked more.

If it's just a matter of how people prefer to have information presented to them, that should be something LLMs are equipped to adapt to at a user-by-user level, based on preferences.

I thought it was based on users' thumbs-up and thumbs-down reactions. That it evolved the way it did makes it pretty obvious that users want their asses licked.

They have added settings for this now - you can dial up and down how “warm” and “enthusiastic” you want the models to be. I haven’t done back to back tests to see how much this affects sycophancy, but adding the option as a user preference feels like the right choice.

If anyone is wondering, the setting for this is called Personalisation in user settings.

This doesn't come as too much of a surprise to me. Feels like it mirrors some of the reasons why toxic positivity occurs in the workplace.

Put on a good show, offer something novel, and people will gleefully march right off a cliff while admiring their shiny new purchase.

you haven't been in tech long enough if you don't realize most decisions are driven by "engagement"

if a user spends more time on it and comes back, the product team winds up prioritizing whichever pattern was supporting that. it's just a continual selective evolution towards things that keep you there longer, based on what kept everyone else there longer

You're absolutely right. You're not imagining it. Here is the quiet truth:

You’re not imagining it, and honestly? You're not broken for feeling this—its perfectly natural as a human to have this sentiment.