Comment by kelseyfrog

1 month ago

> How can you get a machine to have values?

The short answer is a reward function. The long answer is the alignment problem.

Of course, everything in the middle is what matters. Explicitly defined reward functions are complete but not consistent. Data-defined rewards are potentially consistent but incomplete. It's not a solvable problem for machines, and equally not for humans. Still, we practice, improve, and muddle through despite this, and hopefully approximate improvement over long enough timescales.

2 comments

janalsncm 1 month ago

Well, it’s pretty clear to me that the current reward function of profit maximization has a lot of downsides that aren’t sufficiently taken into account.

philipallstar 1 month ago

The only thing worse than it is anything else-maximisation.