Comment by Lerc
21 hours ago
When Asimov wrote those works, there was optimism that symbolic artificial intelligence would provide the answers.
>But the specific issues we are dealing with have little to do with us feeling safe and protected behind some immutable rules that are built into the system
If your interpretation of the Robot books was that they suggested a few immutable rules would make us safe and protected, you may have missed the primary message. The overarching theme was an exploration of what those laws could do, and of how they may not necessarily correlate with what we want or even perceive as safe and protected. If anything, the rules represented a starting point, and the books presented a challenge to come up with something better.
Anthropic's work on autoencoding activations down to measurable semantic points might prove a step toward that something better. The fact that they can perform manipulations based on those semantic points does suggest something akin to the laws of robotics might be possible.
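The kind of manipulation I mean can be sketched roughly. This is a toy illustration, not Anthropic's actual code: all weights, sizes, and function names here are made-up stand-ins. The idea is that a sparse autoencoder maps an activation vector into a larger, sparser feature space where individual dimensions are more interpretable, and "steering" clamps one feature before decoding back into model space.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_features = 16, 64  # toy sizes; real models are vastly larger

# Stand-in weights for a trained sparse autoencoder (random here, learned in practice).
W_enc = rng.normal(0, 0.1, (n_features, d_model))
b_enc = np.zeros(n_features)
W_dec = rng.normal(0, 0.1, (d_model, n_features))

def encode(x):
    # Sparse feature activations: ReLU over a learned dictionary.
    return np.maximum(0.0, W_enc @ x + b_enc)

def steer(x, feature_idx, value):
    # Clamp one (hypothetically interpretable) feature, decode back to model space.
    f = encode(x)
    f[feature_idx] = value
    return W_dec @ f

x = rng.normal(0, 1.0, d_model)          # a stand-in for a model activation
x_steered = steer(x, feature_idx=3, value=5.0)
assert x_steered.shape == x.shape
```

If a feature reliably tracks a concept like "deception," clamping it is at least mechanically closer to a built-in "law" than anything prompt-level.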
When it comes to alignment as many describe it, it is simply impossible, because humans themselves are not aligned. Picking a median, mean, or lowest common denominator of human values would be a choice that people probably cannot agree on. We are unaligned even on how we could compromise.
In reality, if you have control over what AI does, there are only two options:
1. We can make AI do what some people say, or
2. We can make them do what they want (assuming we can make them want anything).

If we make them do what some people say, that hands the power to those who have that say.
I think there will come a time when an AI will perceive people doing something wrong, that most people do not think is wrong, and the AI will be the one that is right. Do we want it to intervene or not? Are we instead happy with a nation developing a superintelligence that is subservient to the wishes of, say, Vladimir Putin?
As I alluded to earlier, to me the books were more an exploration of man's hubris in thinking control could be asserted through failed attempts to distill spoken and unspoken human rules into a few "laws".
Giskard and Daneel spend quite a lot of time discussing the impenetrable laws that govern human action. That sounds more like what is happening at the current frontier of AI than the mechanical trains of thought with only a single pathway to travel, which is closer to how Asimov described robot reasoning in the Robot books.
Edit: I feel like I’m failing to make my point clearly here. Sorry. Maybe an LLM can rephrase it for me. (/s lol)