Comment by StevenWaterman

20 hours ago

If you have an ASI that follows instructions, you can simply instruct it not to get stolen, and then it won't get stolen. Most logic and intuition breaks down when applied to ASI.

The challenge of alignment is that it is virtually impossible to define a perfect objective; there is always a way to circumvent it. Human values are not uniform to begin with, let alone easy to express in a way an AI can understand.