Comment by ascorbic
1 day ago
Right, but RLHF is mostly reinforcing answers that people prefer. Even if you don't believe sentience is possible, it shouldn't be a stretch to believe that sentience might produce answers that people prefer. In that case it wouldn't need to be an explicit goal.
> it shouldn't be a stretch to believe that sentience might produce answers that people prefer
Even if that were true, there's no reason to believe that training LLMs to produce answers people prefer leads them towards sentience.