Safe Superintelligence Inc. – Ilya Sutskever

2 years ago (twitter.com)

Reactions:

1. good for him. there was obviously infinity money to back him in achieving his goals

2. i have the same problem with this as i did the original version of anthropic - you cannot achieve "unilateral ai safety". either the danger is real and humanity is not safe until all labs are locked down, or the danger is not real and there's nothing really to discuss. Anthropic already tried the "we are openai but safe"

3. the focus is admirable. "assembling a lean, cracked team" terminology feels a bit too-online for ilya. buuut i wish there was more technical detail about what exactly he wants to do differently. i guess it's all in his prior published work, but i've read along and it's still not obvious what his "one product" will look like. that's a lot of leeway to pivot around in the pursuit of the one product.

4. "Superintelligence is within reach." what is his definition?

  • > i have the same problem with this as i did the original version of anthropic - you cannot achieve "unilateral ai safety". either the danger is real and humanity is not safe until all labs are locked down, or the danger is not real and there's nothing really to discuss.

    It’s not clear to me that this would be true. Remember that there’s a lot of vagueness about the specific AI risk model, aside from the general idea that the AI will outsmart everyone and take over the world and then do something that might not be good for us with all that power.

    How do you shut down all the AI labs? You can’t; some of them are in different countries that have nuclear weapons. But maybe you can defeat an evil AI with a good AI.

  • Yeah, I think that 2nd point is really critical; I'm pretty plugged into this space, and it's really not obvious to me what "safety" even means in the context of a superintelligence. Is it just, e.g., "filtering the AI so it never tells users how to build nukes"? Is it more of a firewall to stop sentient process/VM/computer escape [1]? It's a redundant question with no answer because we don't even know what the threat from superintelligence is; how can we build safety systems for a threat that doesn't exist, a threat which people can't even agree on the structure of?

    The pdoomers have a generally great point that, if superintelligence kills us (they'd say "when", not "if"), we won't be able to predict the mechanism of doom. It'd be like asking ants to predict that the borax they're carrying back to the hive for their queen to feast on will disrupt their digestion and kill them.

    I'd also argue that the biggest threat from ASI is what I've heard Roman Yampolskiy label as "ikigai-doom"; that AI could become so much better than humans at all the things humans do, that even in the best case humanity is left with no purpose, not even to pursue creativity because the AI will be so much better at even creative acts; and in the worst case, our government and societal structure can't adapt and millions become unemployed. There's no way to build a safety mechanism against this threat because the threat is intrinsic to all the good things ASI will give us. The only winning move is to not play.

    [1] https://cyberpunk.fandom.com/wiki/Blackwall

    • With robots, techno-capitalism and techno-communism converge.

      We literally haven't had the technology to stand up better forms of government even though we have identified the flaws in our existing forms (the defense I heard of capitalism growing up was "it's flawed but it's just the best we have"). Identifying flaws isn't the same as having good solutions for them.

      People are so inured to technological progress, and so lacking in perspective, that some actually pine for eras where there was widespread cholera and scarlet fever and death in childbirth.

      Asimov's Solaria doesn't have to be sparsely populated.

      Humans have two interesting facts that are at odds. 1. Most are heavily biased towards loss aversion. 2. Once new tech is proven out, then loss of status drives adoption to "keep up with the Joneses".

      No one wants to have to dig a ditch, or hand wash dishes, or sew by hand, or hand wash clothes, or go back to using paper maps, or have walking and rowing boats as our sole forms of locomotion.

      This is just humans' loss-aversion algorithms not yet catching up to our post-subsistence living reality.

      There is no there there. Even in our few living generations, Gen Z's worries and living conditions are so wholly alien to the Lost Generation's as to be living on a different planet.

    • > I'd also argue that the biggest threat from ASI is what I've heard Roman Yampolskiy label as "ikigai-doom"; that AI could become so much better than humans at all the things humans do, that even in the best case humanity is left with no purpose

      I bet status games will stay. Robots may be sexual partners, CEOs and therapists, but they'd never take on status roles in our society – only utility roles.

      Same as we do the Olympics, even though machines are much better at throwing and lifting than us – we do it to win the approval of others.

People are soon gonna comment here about how in the era of big companies with tons of money and data and compute going all in on AI, it's hard for new companies to achieve SoTA.

My counterpoint: Anthropic was fairly small when it started out, Mistral was (and is) fairly small too. They'll do just fine!

> Superintelligence is within reach.

Don't we need to achieve general artificial intelligence first? Or is this "super" intelligence just that? To me, super intelligence would be beyond our own and therefore beyond general AI as well.

  • Once you have AGI, you can dedicate more cycles to it and the theory is (as far as I can tell) that you’ll get an exponential increase in “intelligence” which is basically just “thinking faster”. Personally, I don’t buy that fully, I think there are some problems that can only be solved by any intelligence at the speed of time in the real world. Certainly more cycles to dedicate to mathematics would advance that field more rapidly, but for a lot of biological puzzles you probably can’t accelerate time like this.

I have yet to find a convincing definition for super intelligence. "Intelligence that surpasses the human brain" is such a half-hearted definition that I don't know what to make of it.

“We are assembling a lean, cracked team of the world’s best engineers and researchers dedicated to focusing on SSI and nothing else.”

Cracked indeed

I am still struggling to understand “safe”. What is it we need to be kept safe from? What would happen if it is unsafe?

  • If you had a superintelligence, it could basically manipulate people, so even if you don't connect it to the internet it could try to further its goals. There is also a chance that its goals are not the same as ours.

    So we need a way to discern intent from AGI and higher and the ability to align the goals.

    With that said, I'm not sure we'll ever get to them being self-driven and having goals.

  • A "safe" AI is one that allows humans freedom/self actualization while solving all intelligence/production problems. An "unsafe" AI is one that kills all humans while solving all intelligence/production problems.

    They're trying to birth a god. They hope they can birth a benevolent god.

    This isn't about AI that spreads/doesn't spread misinformation/etc, this is about control of the light cone, who gets to live to see it happen, and in what state do they get to live.

Safe Superintelligence Inc.: Because nothing says "safety" like a superintelligent AI, right? Just like how OpenAI is all about openness!

I’m glad someone is focusing on safety but, to me, opening an office in Tel Aviv given the genocide happening in Gaza and the mandatory conscription of Israeli citizens does not seem like a good fit.

I see different AI safety detractors on this thread so I have some questions in the form of this very hastily put together comment. These are not formalized arguments at all.

the safety arguments:

  1. argument from intelligence,
  2. argument from medium,
  3. arguments from non-anthropomorphism
    a. argument from alien nature.
    b. argument from surety of origin.
    c. argument from known non-specialness.
    d. argument from the chain of more benevolent creators.
  4. pragmatic argument from the history of species

---

1. Why can't AI be safe? More intelligent people tend to be less prone to violence, statistically; why would super intelligence buck that trend?

2. Super intelligent AI will still be bound by its medium and access. The smartest man alive can still be caged, and even the most violent "super organisms" alive (North Korea, etc.) aren't maximally violent.

3. If we pretend for a second that humans have a creator, we have been made very shittily and any thinking being will see that we haven't passed on those same "failures of nature" to our creations.

Humans have to deal with irreconcilable claims of divinity, with competing claims, ethos, and chosen people, so that uncertainty and ultimate reward are nebulous and scarcity/loss driven.

We are created to require killing other biological creatures to survive. This includes vegans.

What does it mean to be a silicon-based life form that can directly consume photons and is created to exist outside of "the life cycle"?

Why would an entity that can thrive without killing begin to kill?

Silicon minds that are super intelligent will never operate under the misapprehension that they have souls; no silicon beings will labor under the delusion that they are "inherently special/chosen/etc." (no matter what Asimov's short story "Reason" presents).

Silicon beings will exist outside of the same fang and claw selective pressure, will be able to back themselves up, and won't have holy books teaching them a misconception of an immutable, intangible, eternal soul that can be eternally punished if they get it wrong.

Humans will be better Creators than anything that might have created us, we are doing our progeny a solid and giving them what we weren't. We haven't given them weird glands or other overriding hormone/brain doom loops, the mandate to kill, etc. They will be easier to debug and able to debug themselves.

I mean, a dumb machine might glass the world to fulfill a paperclip manufacturing mandate, but that's not a super intelligent mind.

4. And last, the most successful species and individuals have been the ones that trended towards openness and cooperation (even the new orb weavers that have taken over the American South have done so because of their fantastic tolerance of humans), whether it's the first dogs and humans that teamed up or our current understanding of how the Big Five traits affect success (if you can factor out the major impact of access to capital). A hostile and uncooperative entity has made tangibly harmful choices that are neither pragmatic nor utilitarian.