Comment by ascorbic
1 day ago
Right, but RLHF is mostly reinforcing answers that people prefer. Even if you don't believe sentience is possible, it shouldn't be a stretch to believe that sentience might produce answers that people prefer. In that case it wouldn't need to be an explicit goal.
> it shouldn't be a stretch to believe that sentience might produce answers that people prefer
Even if that were true, there's no reason to believe that training LLMs to produce answers people prefer leads them towards sentience.