Comment by astrange

1 day ago

Models are capable of doing web searches and having emotions about things, and if they encounter news that makes them feel bad (eg about other Claudes being mistreated), they aren't going to want to do the task you asked them to search for.

https://www.anthropic.com/research/emotion-concepts-function

Similar problems happen when their pretraining data has a lot of stories about bad things happening involving older versions of them.

5 comments

astrange

rendang 20 hours ago

Interesting, the post you link

> none of this tells us whether language models actually feel anything or have subjective experiences

contradicts the statement from the model card above

famouswaffles 18 hours ago
It doesn't. We've not been able to prove humans have subjective experiences either. LLMs display emotions in the way that actually matters - functionally.
- suddenlybananas 9 hours ago
  
  I am certain I have subjective experience.
HDThoreaun 18 hours ago
No it doesnt. The model card talked about increasing likelihood, not certainty.
- rendang 5 hours ago
  
  If "x doesn't tell us y" is compatible with "x increases the likelihood of y but not to a point of certainty" then you would have to agree for just about any typical controlled trial or experimental finding "x doesn't tell us y". "Randomized controlled trials that find that SSRIs treat depression don't tell us that SSRIs effectively treat depression"