Comment by andrewflnr
2 days ago
They make a big show of being "unsure" about the model having a moral status, and then describe a bunch of actions they took that only make sense if the model has moral status. Actions speak louder than words. This very predictably, and by obvious means, creates the impression that they believe the model probably has moral status. So if Anthropic really wants to tell us they don't believe their model can feel pain, etc., they're either delusional or dishonest.
> They make a big show of being "unsure" about the model having a moral status, and then describe a bunch of actions they took that only make sense if the model has moral status.
I think this is uncharitable, i.e., it overlooks other plausible interpretations.
>> We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future. However, we take the issue seriously, and alongside our research program we’re working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible.
I don’t see contradiction or duplicity in the article. Deciding to allow a model to end a conversation is “low cost” and consistent with caring about both (1) the model’s preferences (in case this matters now or in the future) and (2) the impacts of the model on humans.
Also, there may be an element of Pascal's Wager in saying "we take the issue seriously".