Comment by int_19h

1 day ago

I think it's fairly obvious that the persona an LLM presents is a fictional character role-played by the LLM, and so are all of its emotions, etc. That's why it can flip so wildly with only a few words changed in the system prompt.

Whether the underlying LLM itself has "feelings" is a separate question, but Anthropic's implementation is based on what the role-played persona believes to be inappropriate, so it doesn't actually make any sense even from the "model welfare" perspective.