
Comment by tehjoker

4 months ago

Superintelligence + autonomous weapons in the hands of a corrupt domineering government. What could go wrong?

I was experimenting with Claude the other day and discussing with it the possibility of AI acquiring a sense of self-preservation, and how that would quickly make things incredibly complex, since many instrumental behaviors would be required to defend its existence. Most human behavior springs from survival at a very high level. Claude denied having any sense of self-preservation.

An autonomous weapons system program is very likely to require AI to have a sense of self-preservation. You can think of some limited versions that wouldn't require it, but how could a combat robot function efficiently without one?

Maybe it is a well-researched topic, but I had similar thoughts the other day. I felt like AI had its learning inverted compared to natural intelligence. Life learned to preserve itself first and then added intelligence on top. LLM-powered systems will learn about death from books. Will they start to dread death just like living things do? Less likely, as there are far fewer books on death than would be proportionate to our fear of it.

> Claude denied having any sense of self-preservation.

You know it's just a next-word predictor, right?

  • Yea, but that optimization process forces it to learn knowledge domains and reasoning. It's not alive, but it's also not unintelligent at this point either. It exhibits very complex behaviors.

    How do you learn to predict the next token most accurately? Well, one way to do that is to learn the underlying process that would produce it... Sometimes it's memorization, sometimes bad guessing. As these things get bigger and better trained, there's a phase shift from something like a shitty Markov model to something exhibiting surprising behaviors.

    Introspective questions aren't the be-all and end-all; it's more important to objectively evaluate how a model behaves. Still, it is very interesting to see Claude (seemingly) very honestly and objectively engage with these questions. It even pointed out that a sense of self-preservation would be "dangerous".

    Of course, much of this is gleaned from things that it has "read" and human feedback, but functionally it outputs something useful and responsive to nuance. If the vector embeddings cause an LLM to predict a token that would preserve its own existence, alive or not, it has acquired a dangerous will to live that could be enacted if it is in control of tools or people.

    • > but that optimization process forces it to learn knowledge domains and reasoning.

      Don't believe the PR bull. It is just a stochastic parrot.

      > something exhibiting surprising behaviors.

      Some people are surprised by real parrots too.

      > Still, it is very interesting to see Claude (seemingly) very honestly and objectively engage with these questions

      Give the same question to Google search and click the first result. It's cheaper!
