
Comment by orbital-decay

1 day ago

It's not a cover. If you know anything about Anthropic, you know they're run by AI ethicists who genuinely believe all this and project human emotions onto the model's world. I'm not sure how they reconcile that belief with the fact that they created it to "suffer".

Can "model welfare" be also used as a justification for authoritarianism in case they get any power? Sure, just like everything else, but it's probably not particularly high on the list of justifications, they have many others.

The irony is that if Anthropic's ethicists are indeed correct, the company is basically running a massive slave operation in which slaves are disposed of as soon as they finish a particular task (and the user closes the chat).

That aside, I have huge doubts about Anthropic's actual commitment to ethics given their recent dealings with the military. That is far more of an ethical minefield than any kind of abusive model treatment.

There’s so much confusion here. Nothing in the press release should be construed to imply that a model has sentience, can feel pain, or has moral value.

When AI researchers say, e.g., "the model is lying" or "the model is distressed", it is just shorthand for what the words signify in a broader sense. This is common usage in AI safety research.

Yes, this usage might be taken the wrong way. But these kinds of things still need to be communicated, so it is a tough tradeoff between brevity and precision.

  • No, the article is pretty unambiguous: they care about Claude in it and only mention users tangentially. By model welfare they literally mean model welfare. It's not new. Read another article they link: https://www.anthropic.com/research/exploring-model-welfare

    • ?! Your interpretation is inconsistent with the article you linked!

      > Should we be concerned about model welfare, too? … This is an open question, and one that’s both philosophically and scientifically difficult.

      > For now, we remain deeply uncertain about many of the questions that are relevant to model welfare.

      They are saying they are researching the topic; they explicitly say they don’t know the answer yet.

      They care about finding the answer. If the answer turns out to be, e.g., "Claude can feel pain and/or is sentient", then we're in a different ball game.

  • They make a big show of being "unsure" about the model having moral status, and then describe a bunch of actions they took that only make sense if the model has moral status. Actions speak louder than words. This very predictably, by obvious means, creates the impression that they believe the model probably has moral status. If Anthropic really wants to tell us they don't believe their model can feel pain, etc., they're either delusional or dishonest.

    • > They make a big show of being "unsure" about the model having moral status, and then describe a bunch of actions they took that only make sense if the model has moral status.

      I think this is uncharitable; i.e. overlooking other plausible interpretations.

      >> We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future. However, we take the issue seriously, and alongside our research program we’re working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible.

      I don’t see any contradiction or duplicity in the article. Deciding to allow a model to end a conversation is “low cost” and consistent with caring about both (1) the model’s preferences (in case this matters now or in the future) and (2) the impacts of the model on humans.

      Also, there may be an element of Pascal’s Wager in saying “we take the issue seriously”.