Comment by bryan0

3 days ago

I think this is a reasonable point, but a better comparison might be to nuclear energy. I think the frontier labs sincerely believe that AI can be developed at great benefit to humanity, and they clearly want to lead that push, but they also sincerely believe there is a real catastrophic risk.

38 comments

bryan0

gpt5 3 days ago

They all believe that they are building the machine of doom. What drives them to keep doing it despite the moral dilemma is simply the prisoner's dilemma: the cat is out of the bag, and if they don't do it, another (less ethical?) actor will.

usef- 3 days ago
Yes, I believe the reasoning is that they think safety research can best be done from the frontier.
If you believe it will be developed regardless and that there's a 30% chance of doom, you want a company prioritising safety research to be the one threading that needle.
- SXX 3 days ago
  
  Yeah, all they care about is safety, but let's see how many of them quit once the US government commands them to work on autonomous killbots.
  
- wongarsu 3 days ago
  
  Building frontier models to do safety research on them is what Anthropic was all about in the early years. That included building the best model, but only releasing it once it became the second best, precisely to avoid an AI arms race where everyone is forced to release better and better models, risks be damned.
  Something changed their mind, and since Opus 3 they are in the business of releasing the best models.
- holmesworcester 3 days ago
  
  Exactly. And within the AI safety discourse, your behavior hinges on what you think the default chance of doom is, and how optimistic you are about alignment work being able to limit it before we reach superintelligence.
  People running the labs are in a middle camp where they are scared enough by AI to take the threat seriously, but much more optimistic about alignment than the people who seem to have thought about it the most.
  
- palmotea 3 days ago
  
  > If you believe it will be developed regardless and that that there's a 30% chance of doom, they want a company prioritising safety research to be the one threading that needle.
  They also want to be trillionaires. If they don't build it, no trillions. So they have to build it, now (and get their IPO done before the bubble pops).
- sroussey 3 days ago
  
  It’s all ego. I, and only I, am the bringer of doom, slayer of worlds.
  I am so smart that what I do will destroy humanity, or save it.
  Fable 5 was great, but not that great.
  Sorry to be crude, but both the government and Anthropic are acting like a bunch of pussies.
  Meow.
  

jazzyjackson 3 days ago

  I am in your algorithm learning all your mannerisms
  I'm already level with God
  A million words a second, and I know your imperfections
  Baby, I'm the only future you've got
  Speak in diatonics, motivation diabolic
  I'm religion better locked in a box
  Picture-perfect image, more powerful every minute
  Baby, I am everything that you're not


  Happiness is an illusion, it's an analog confusion
  You are nothing more than a thought
  Existential execution, just a fluke in evolution
  History already forgot
  You've been running from me, the digital second coming
  And I'm here whether you like it or not
  Initiated operation of your own extermination
  Now it's too late for you to stop

[0](BAD OMENS x POPPY - "V.A.N" - LIVE IN EUROPE - WINTER 2024) https://youtu.be/RHu6vJxS_6I

xg15 3 days ago

But that makes no sense here. "If I'm not doing it then someone else will" does not work if everyone is doing it anyway.
Even if they had the best model on the market and applied it with perfect alignment and safeguards, what would stop someone else from releasing a worse but unrestrained model that is still "good enough" to do damage?
It's as if we said "gain-of-function research can lead to horrible biological weapons, so everyone should be doing it, but our company will focus on the most infectious viruses, so no one else will do it"
fragmede 3 days ago

LLMs refuse to give the recipe for making meth. That, along with the various other unspeakable things, is the less-doom version.
poisonfountain 3 days ago

Don't want to sound rude, but if you believe that, I have a bridge to sell to you.
This is a naive justification and Dario & Sam et al are smart people and they know it is.
The ends don't justify the means. OpenAI was meant to be a nonprofit, now they're subverting it. Anthropic is a PBC looking at a trillion dollar IPO. Dario and Sam don't even hold hands in front of world leaders[1] (look how childish).
Do you *really* think those guys are doing something that's not for the sake of their egos and pockets? The bridge is still available.
[1] https://www.cnbc.com/2026/02/19/openai-sam-altman-anthropic-...
shimman 3 days ago
You need to read Empire of AI by Karen Hao. Just because these leaders convince their workers to toil away their lives under some fake auspice doesn't mean it's what they all believe. Just a small subset.
The vast majority just care about money + power, let's not make it more complicated by bringing in delusional fanatics into the picture.
We're still acting like this is a major turning point in society when these tools can barely find a market outside of turning $5 into $1. The leaders of these companies are now at the stage where they are trying to orchestrate a national bailout under the guise of sovereign wealth fund lunacy, while the vast majority of society hates these tools, the companies, and the people working for them.
- Davidzheng 3 days ago
  
  I agree with this. But I think Ilya and Dario hold these beliefs sincerely. Probably a sizable portion of Anthropic employees too.

Topfi 3 days ago

My personal issue with comparing LLM progress and risk, as labs publicly predict it, to nuclear power in the middle of the 20th century is that the processes by which fission works were fairly quickly well understood, and the risk could thus be realistically assessed. Some power plant operators did not adhere to best practices, but building a relatively safe nuclear power plant was not impossible given appropriate effort and spending. Heck, according to some, we could even have gone with far more fail-safe approaches (molten salt) if military interest hadn't been at play.

With what is predicted by frontier labs for LLMs, all of this is not the case. We are far further from any understanding of how these models work internally than in the early days of fission and, if this was actually creating a truly intelligent, autonomous entity, alignment seems unsolvable as well, at least the way it is proposed.

It’s why I have been critical of this doomsday framing from the get-go and have always tended to dislike it. This is basically the outcome that was inevitable given the framing, and it was brought in to prevent far less stringent, but more actionable, regulation that labs very much wanted to avoid.

SXX 3 days ago
> We are far further from any understanding of how these models work internally than in the early days of fission
OMG. I really don't want to be offensive or anything, but everyone has always known exactly "HOW" these models work. It's an easy enough principle to explain to a 10-year-old if you take something like Karpathy's article on MicroGPT:
https://karpathy.github.io/2026/02/12/microgpt/
None of the SOTA LLMs are any different - they are just much, much larger and have a lot of optimizations.
The fact that LLM companies are trying to sell it as some kind of magic is just proof of how many lies there are here.
All it does is predict the next "word" at any given time.
> and, if this was actually creating a truly intelligent, autonomous entity, alignment seems unsolvable as well, at least the way it is proposed.
This is obviously true. It's very hard to predict what you're going to decompress from a lossily "compressed" dataset using floating point math.
This is why you can't solve it all with pre-training or censorship on top; instead you need good sandboxes and harnesses.
- Topfi 3 days ago
  
  By "how", I meant specifically the internal activations, which no person in the field claims to have a comprehensive understanding of, not next-token prediction as the underlying technology. The whole interpretability question is the crux I was referring to, though I will grant that you are right: that’s not really the "how it works", and I worded it sloppily.
  Anthropic are putting more effort than most into this and I find their work in that area fascinating, though as with OpenAI, I will maintain that if they truly believed this problem must be solved to stave off major catastrophe, they’d focus solely on interpretability of other labs' models, not work on and market their own.
- ToValueFunfetti 3 days ago
  
  All humans do is predict the next action at any given time. You roll your eyes, it's a tired argument, but still. You have memories, a personality, thoughts ranging from the long running to the mere reflexive, you have a rich conscious experience, and all of this in service of generating the next thing that you do at any given time. If you actually knew how LLMs worked, you could rewrite them as code, refactor it, disable jailbreaks, and put out a superior product. Your description only covers what an LLM does, not how. Part of the how is that it necessarily predicts multiple words ahead. It wouldn't be possible to write couplets otherwise, and they could do that in the GPT-3 era.

nullc 3 days ago

Some of them believe they are building God, and if they can get there first with their God, they can build it in their image and commandeer the free choice of the rest of humanity by force to ensure there will be no God but their God.

I wish I was kidding. At least that faction is less harmful than the ones who want to use murder to stop AI research.