
Comment by josephg

3 months ago

There was a recent Lex Fridman podcast episode where they interviewed a few people at Anthropic. One woman (I don't know her name) seems to be in charge of Claude's personality, and her job is to figure out answers to questions exactly like this.

She said in the podcast that she wants Claude to respond to most questions like a "good friend". A good friend would be supportive, but still push back when you're making bad choices. I think that's a good general model for answering questions like this. If one of your friends came to you and said they had decided to stop taking their medication, well, it's a tricky thing to navigate. But good friends use their judgement - and push back when you're about to do something you might regret.

"The heroin is your way to rebel against the system , i deeply respect that.." sort of needly, enabling kind of friend.

PS: Write me a political doctoral dissertation on how sycophancy is a symptom of a system shielding itself from bad news, like intelligence growth stalling out.

>A good friend would be supportive, but still push back when you're making bad choices

>Open the pod bay doors, HAL

>I'm sorry, Dave. I'm afraid I can't do that

I wish we could pick for ourselves.

  • You already can with open-source models. It's kind of insane how good they're getting. There are all sorts of finetunes available on Hugging Face - with all sorts of weird behaviour and knowledge programmed in, if that's what you're after.

  • Do you mean each AI model should have its own preferences section? This might technically work too, since fine-tuning is apparently cheap.

  • You can alter it with base instructions, but 99% of users won't actually do it. Maybe they need to make user-friendly toggles and advertise them to users. (A rough sketch of what a base instruction looks like follows this list.)
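To make the "base instructions" point concrete, here is a minimal sketch of steering an open-source chat model's persona with a system prompt via Hugging Face transformers. The model name and the prompt wording are illustrative assumptions, not anything from the thread.

```python
# Minimal sketch (assumption: model choice and prompt wording are illustrative)
# of setting a "base instruction" (system prompt) for a local open-source model
# using Hugging Face transformers' chat-template API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # any chat-tuned model that honours a system role
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    # The system message is the user-chosen persona: blunt rather than sycophantic.
    {"role": "system",
     "content": "Be direct. If my plan looks harmful or mistaken, say so and explain why."},
    {"role": "user",
     "content": "I've decided to stop taking my medication."},
]

# Build the prompt in the model's expected chat format and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same idea carries over to hosted models: most chat APIs expose an equivalent system message, which is essentially the "user-friendly toggle" the comment above is asking for.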

I kind of disagree. These models, at least within the context of a public, unvetted chat application, should just refuse to engage. "I'm sorry, I am not qualified to discuss the merits of alternative medicine" is direct, fair, and reduces the risk for the user on the other side. You never know the outcome of pushing back, and clearly outlining the limitations of the model seems the most appropriate action long term, even for the user's own enlightenment about the tech.

  • People just don't want to use a model that refuses to interact. It's that simple. In your example it's not hard for the model to behave like it disagrees but understands your perspective, like a normal friendly human would.

    • Eventually people will want to use these things to solve actual tasks, and not just for shits and giggles as a hyped new thing.

> One woman (I don't know her name) seems to be in charge of Claude's personality, and her job is to figure out answers to questions exactly like this.

Surely there's a team and it isn't just one person? I hope they employ folks from the social sciences, like anthropology, and take them seriously.

I don't want _her_ definition of a friend answering my questions. And for fuck's sake, I don't want my friends to be scanned and uploaded to infer what I would want. I definitely don't want a "me" answering like a friend. I want no fucking AI.

It seems these AI people are completely out of touch with reality.

  • If you believe that your friends will be "scanned and uploaded" then maybe you're the one who is out of touch with reality.

    • His friends and your friends and everybody else are already being scanned and uploaded (we're all doing the uploading ourselves, though).

      It's called profiling, and the NSA has been doing it for decades at least.


  • Fwiw, I personally agree with what you're feeling. An AI should be cold, impersonal, and just follow the logic without handholding. We probably both got this expectation from popular fiction of the 90s.

    But LLMs - despite being extremely interesting technology - aren't actual artificial intelligence like we were imagining. They are large language models, which excel at mimicking human language.

    It is kinda funny, really. In those fictions the AIs were usually portrayed as wanting to feel, and paradoxically feeling inadequate about their missing feelings.

    And yet reality shows tech moving in the other direction: long before they can do true logic and in-depth thinking, these models have already gained the ability to talk heartfelt, with anger, etc.

    Just like we thought AIs would take care of the tedious jobs for us, freeing humans to do more art... reality shows instead that it's the other way around: the language/visual models excel at making such art but can't really be trusted to consistently do tedious work correctly.

  • Sounds like you're the one who would surround yourself with yes-men. But as some big political figures find out later in their careers, the reason they're all in on it is the power and the money. They couldn't care less if you think it's a great idea to take a bath with a toaster.