Anthropic drops flagship safety pledge

2 days ago (time.com)

692 comments

cwwc

I was wondering if it was because of heavy-handedness of the administration, but apparently:

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.

Their core argument is that if we have guardrails that others don't, they would be left behind in controlling the technology, and they are the "responsible ones." I honestly can't comprehend the timeline we are living in. Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.

ACCount37 16 hours ago
That's because it is.
AI is powerful and AI is perilous. Those two aren't mutually exclusive. Those follow directly from the same premise.
If AI tech goes very well, it can be the greatest invention of all human history. If AI tech goes very poorly, it can be the end of human history.
- observationist 16 hours ago
  
  Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.
  -Irving John Good, 1965
  If you want a short, easy way to know what AGI means, it's this: Anything we can do, they can do better. They can do anything better than us.
  If we screw it up, everyone dies. Yudkowsky et al are silly, it's not a certain thing, and there's no stopping it at this point, so we should push for and support people and groups who are planning and modeling and preparing for the future in a legitimate way.
  
- joshribakoff 16 hours ago
  
  You wouldn’t say that rolling dice is dangerous. You would say that the human who decides to take an action, depending on the value of the dice is the danger. I don’t think AI is dangerous. I think people are dangerous.
  
- cael450 16 hours ago
  
  Tbh, I find this argument really stupid. The word prediction machine isn’t going to destroy humanity. Sure, humans can do some dumb stuff with it, but that’s about it.
  Stop mistaking science fiction for science.
  
- overgard 14 hours ago
  
  True of AGI, but what we have right now doesn't fit that bill. (I would encourage people that disagree with this to go talk to ChatGPT about how LLMs and reasoning models work. Seriously! I'm not being snarky. It's very good at explaining itself. If you understand how reasoning works and what an LLM is actually doing it's hard to believe that our current models are going to do much more than become iteratively more precise at mimicking their training datasets.)
- paradox242 15 hours ago
  
  It needs to go well every single day, and only needs to go very poorly once. Not to conflate LLMs with actual super intelligence, but for this (and many other reasons related to basic human dignity), this is not a technology that a responsible society should be attempting to build. We need our very own Butlerian Jihad
  
- PowerElectronix 16 hours ago
  
  Same with everything, right? You could say the same with nukes, electricity, internet, the computer, etc... But if you look at it without paying attention to the "ultimate tool for humanity" hype, it doesn't really look that much of a threat or a salvation.
  It won't end civilization for dropping the guardrails, but it will surely enable bad actors to do more damage than before (mass scams, blackmail, deepfake nudes, etc.)
  There are companies that don't feel the pressure to make their models play loose and fast, so I don't buy anthropic's excuse to do so.
  
- tokyobreakfast 14 hours ago
  
  > If AI tech goes very poorly, it can be the end of human history.
  "Just unplug the goddamn thing!"
  Also consider if something is so bad it makes you wince or cringe, then your adversaries are prepared to use it.
  
- SecretDreams 15 hours ago
  
  > If AI tech goes very well
  The IF here is doing some very heavy lifting. Last I checked, for profit companies don't have a good track record of doing what's best for humanity.
  
- HardCodedBias 16 hours ago
  
  "If AI tech goes very well, it can be the greatest invention of all human history"
  As has been said at many all hands:
  Let's all work on the last invention needed by humans.
  
tyre 17 hours ago
“A source familiar with the matter” is almost certainly a company spokesperson.
If they were unrelated, Anthropic wouldn’t be doing this this week because obviously everyone will conflate the two.
- metalliqaz 16 hours ago
  
  yeah that part is 100% BS
Rapzid 17 hours ago
Well, before, Anthropic thought they were God's gift to AI: the chosen ones protecting humanity.
With the latest competing models they are now realizing they are an "also" provider.
Sobering up fast with an ice bucket of 5.3-codex, Copilot, and OpenCode dumped on their head.
- tumdum_ 17 hours ago
  
  Hello sama
  
tenthirtyam 16 hours ago
I always enjoyed the Terminator movie series, but I always struggled to suspend my disbelief that any humans would give an AI such power without having the ability to override or pull the plug at multiple levels. How wrong I was.
N.B. the time travel aspect also required suspension of disbelief, but somehow that was easier :-)
- zerkten 15 hours ago
  
  We delegate power already. Is unleashing AI in some place different from unleashing JSOC on an insurgency in a particular place? One is code and the other is a bunch of humans.
  You expect the humans to follow laws, follow orders, apply ethics, look for opportunities, etc. That said, you very quickly have people circling the wagons and protecting the autonomy of JSOC when there is some problem. In my mind it's similar with AI because the point is serving someone. As soon as that power is undermined, they start to push back. Similarly, they aren't motivated to constrain their power on their own. It needs external forces.
  edit: missed word.
- tim333 10 hours ago
  
  We are currently giving them similar power to the average human idiot because I figure they won't do much worse than those. Letting either launch nukes is different.
jdross 17 hours ago
Would nuclear energy research be a good analogy then? Seems like a path we should have kept running down, but stopped bc of the weapons. So we got the weapons but not the humanity saving parts (infinite clean energy)
- DoughnutHole 16 hours ago
  
  Nuclear advancements slowed down due to PR problems from clear and sometimes catastrophic failure of commercial power plants (Three Mile Island, Chernobyl, Fukushima) and the vastly higher costs associated with building safer plants.
  If anything the weapons kept the industry trucking on - if you want to develop and maintain a nuclear weapons arsenal then a commercial nuclear power industry is very helpful.
- raincole 16 hours ago
  
  Nuclear energy hasn't been slowed down much, let alone stopped. China has been building new reactors every year for more than a decade and there are >30 ones under construction.
  The same will go for AI, btw. Westerners' pearl clutching about AI guardrails won't stop China from doing anything.
  
- turtlesdown11 17 hours ago
  
  > Seems like a path we should have kept running down, but stopped bc of the weapons.
  you mean like the tens of billions poured into fusion research?
- shafyy 16 hours ago
  
  It's a path we should have never started going down.
whywhywhywhy 17 hours ago

> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons
They're not really, it's always been a form of PR to both hype their research and make sure it's locked away to be monetized.
whatshisface 15 hours ago

Shouldn't we be a little more skeptical about these abstract arguments when a very concrete sale is on the line?
goodmythical 15 hours ago

Isn't curing cancer just as dangerous as a nuclear bomb? Especially considering some of the gene-therapies under consideration? Because you can bet that a non-negligible portion of research in this space is being funded by governments and groups interested in applications beyond curing cancer. (Autism? Whiteness? Jewishness? Race in general? Faith in general? Could China finally cure western greed? Maybe we can slip some extra compliance in there so that the plebia- ah- population is easier to contr- ah- protect.)
Curing all cancers would increase population growth by more than 10% (9.7-10m cancer related deaths vs current 70-80m growth rate), and cause an average aging of the population as curing cancer would increase general life expectancy and a majority of the lives just saved would be older people.
We'd even see a jobs and resources shock (though likely dissimilar in scale) as billions of funding is shifted away from oncologists, oncology departments, oncology wards, etc. Billions of dollars, millions of hospital beds, countless specialized professionals all suddenly re-assigned just as in AI.
Honestly the cancer/nuclear/tech comparison is rather apt. All either are or could be disruptive and either are or could be a net negative to society while posing the possibility of the greatest revolution we've seen in generations.
mikkupikku 16 hours ago

To paraphrase a deleted comment that I thought was actually making a good point, nuclear medicine and nuclear weapons are both fruit from the same tree.
scottLobster 16 hours ago

> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.
Maybe some of the more naive engineers think that. At this point any big tech business or SV startup saying they're in it to usher in some piece of the Star Trek utopia deserves to be smacked in the face for insulting the rest of us like that. The argument is always "well the economic incentive structure forces us to do this bad thing, and if we don't we're screwed!" Oh, so ideals so shallow you aren't willing to risk a tiny fraction of your billions to meet them. Cool.
Every AI company/product in particular is the smarmiest version of this. "We told all the blue collar workers to go white collar for decades, and now we're coming for all the white collar jobs! Not ours though, ours will be fine, just yours. That's progress, what are you going to do? You'll have to renegotiate the entire civilizational social contract. No we aren't going to help. No we aren't going to sacrifice an ounce of profit. This is a you problem, but we're being so nice by warning you! Why do you want to stand in the way of progress? What are you a Luddite? We're just saying we're going to take away your ability to pay your mortgage/rent, deny any kids you have a future, and there's nothing you can do about it, why are you anti-progress?"
Cynicism aside, I use LLMs to the marginal degree that they actually help me be more productive at work. But at best this is Web 3.0. The broader "AI vision" really needs to die
coffeefirst 15 hours ago

Let's suppose I believe them, that's still a bad idea.
The reason Claude became popular is because it made shit up less often than other models, and was better at saying "I can't answer that question." The guardrails are quality control.
I would rather have more reliable models than more powerful models that screw up all the time.
kelnos 13 hours ago

"It's not because of the Pentagon deal", says company that has just greased the wheels for said Pentagon deal to move forward.
Riiiiiight.
nextaccountic 12 hours ago

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.
This sounds like a lie. But if they are telling the truth, that's terrible timing nonetheless.
francisofascii 16 hours ago

It is a "reasonable" argument to keep yourself in the game, but it is sad nonetheless. You sacrifice your morals and do bad things, so if things get way worse, maybe you will be in a position to stop something really bad from happening. Of course, you might just end up participating in the really bad thing.
austinjp 14 hours ago

> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.
And they alone are responsible enough to govern it.
sonusario 14 hours ago
I wonder if it stems from any of the "AI uprising" stories where humanity is viewed as the cancer to be eradicated.
- ajross 14 hours ago
  
  It's absolutely wild that the Big Moral Question of our time is informed as much by mid-20th-century pop science fiction as it is by an existing paradigm from academia or genuine reckoning with the technology itself.
  If anything that makes me more hopeful and not less. It's asking too much that major decisionmakers, even expert/technical/SV-backed ones, really understand the risks with any new technology, and it always has been.
  To take an example: our current mostly-secure internet authentication and commerce world was won as a hard-fought battle in the trenches. The Tech CEOs rushed ahead into the brave new world and dropped the ball, because while "people" were telling them the risks they couldn't really understand them.
  But now? Well, they all saw War Games growing up. They kinda get it in the way that they weren't ever going to grok SQL injection or Phishing.
amelius 15 hours ago

> Their core argument is that if we have guardrails that others don't, they would be left behind in controlling the technology, and they are the "responsible" ones.
Reminds me of:
https://en.wikipedia.org/wiki/Paradox_of_tolerance
which has the same kind of shitty conclusion.
skeptic_ai 16 hours ago
OpenAI never open sourced anything relevant or in time. Internal email leaks show they only cared about becoming billionaires.
Claude only talks about safety, but never released anything open source.
All this said I’m surprised China actually delivered so many open source alternatives. Which are decent.
Why didn’t Westerners (who are supposed to be the good guys) release anything open source to help humanity? And why always claim they don’t release because of safety and then give the unlimited AI to the military? Just bullshit.
Let’s all be honest and just say you only care about the money, and you take whoever pays.
They are businesses after all so their goal is to make money. But please don’t claim you want to save the world or help humans. You just want to get rich at others’ expense. Which is totally fair. You make a good product and you sell.
- motbus3 16 hours ago
  
  It is hard to understand why other AI companies are still providing model weights at this point.
  My guess is that they know they are not competitors so they make it cheaper or free to hinder the surge of a super competitor.
- pixl97 16 hours ago
  
  I mean, if you have a bunch of guns, it's not really helpful for humanity to dump them on the street, but it does bring up the question of what you're doing building guns in the first place.
- tehjoker 14 hours ago
  
  > Claude only talks about safety, but never released anything open source.
  im still working through this issue myself but hinton said releasing weights for frontier models was "crazy" because they can be retrained to do anything. i can see the alignment of corporate interest and safety converging on that point.
  from the point of view of diminishing corporate power i do think it is essential to have open weights. if not that, then the companies should be publicly owned to avoid concentration of unaccountable power.
  https://www.youtube.com/watch?v=66WiF8fXL0k&t=544s
toss1 7 hours ago

Excellent news. I was seriously worried they would cave when I saw the earlier news they'd dropped their core safety pledge [0].
It is entirely reasonable to not provide tools to break the law by doing mass surveillance on civilian citizens and to insist the tool not be used automatically to kill a human without a human in the loop. Those are unreasonable demands by an unreasonable regime.
[0] https://news.ycombinator.com/item?id=47145963
oatmeal1 14 hours ago
90% of the people cancer kills are over 50. Old people who start believing everything they see on Facebook, but continue voting, with even greater confidence in their opinions. Old people who voted in Trump. Curing cancer would be just about the worst thing AI could do.
- cnd78A 8 hours ago
  
  Unless AI could cure the Flynn effect you are talking about, which results from cultural evolution. Natural evolution is dumb, unlike the kind AI could create (I bet it will either destroy us or make us smarter)
afavour 16 hours ago
It's exhausting to keep up with mainstream AI news because of this. I can never work out if the companies are deluded and truly believe they're about to create a singularity or just claiming they are to reassure investors/convince the public of their inevitability.
- ACCount37 16 hours ago
  
  It's a fairly mainstream position among the actual AI researchers in the frontier labs.
  They disagree on the timelines, the architectures, the exact steps to get there, the severity of risks. Can you get there with modified LLMs by 2030, or would you need to develop novel systems and ride all the way to 2050? Is there a 5% chance of an AI oopsie ending humankind, or a 25% chance? No agreement on that.
  But a short line "AGI is possible, powerful and perilous" is something 9 out of 10 frontier AI researchers at the frontier labs would agree upon.
  At which point the question becomes: is it them who are deluded, or is it you?
  
- grayhatter 16 hours ago
  
  > I can never work out if the companies are deluded and truly believe they're about to create a singularity or just claiming they are to reassure investors/convince the public of their inevitability.
  You can never figure out if the people selling something are lying about its capabilities, or if they've actually invented a new form of intelligence that can rival or surpass billions of years of evolution?
  I'd like to introduce you to Occam's razor
  
3acctforcom 12 hours ago

I lie too.
moogly 15 hours ago

"Those other companies are totally going to build the Torment Nexus, so we have no choice but to also build the Torment Nexus."
cmrdporcupine 15 hours ago
We all made fun of Blake Lemoine and others for spending too many late nights up chatting with (ridiculously primitive by this year's standards) LLM chat bots and deciding they were sentient and trapped.
But frankly I feel like the founders of Anthropic and others are victims of the same hallucination.
LLMs are amazing tools. They play back & generate what we prompt them to play back, and more.
Anybody who mistakes this for SkyNet -- an independent consciousness with instant, permanent, learning and adaptation and self-awareness, is just huffing the fumes and just as delusional as Lemoine was 4 years ago.
Every one of us should spend some time writing an agentic tool and managing context and the agentic conversation loop. These things are primitive as hell still. I still have to "compact my context" every N tokens and "thinking" is repeating the same conversational chain over and over and jamming words in.
Turns out this is useful stuff. In some domains.
It ain't SkyNet.
I don't know if Anthropic is truly high on their own supply or just taking us all for fools so that they can pilfer investor money and push regulatory capture?
There's also a bad trait among engineers, deeply reinforced by survivor bias, to assume that every technological trend follows Moore's law and exponential growth. But that applie[s|d] to transistors, not everything.
I see no evidence that LLMs + exponential growth in parameters + context windows = SkyNet or any other kind of independent consciousness.
- overgard 14 hours ago
  
  I think playing with the APIs is something I'd encourage people excited about these technologies to do. I think it'll lead to the "magic" wearing off but more appreciation for what they actually can accomplish.
- austinjp 14 hours ago
  
  I always feel this argument misses a point. SkyNet may still be a long way off, but autonomous killer drones are here. That is a bad situation my dudes.
  Every step on the journey towards SkyNet is worse than the preceding step. Let's not split hairs about which step we're on: it's getting worse, and we should stop that.
  
api 13 hours ago

The fear mongering always struck me as mostly a bid for regulatory capture and a moat, because without that the moat is small and transient.

latexr 2 days ago

> “We felt that it wouldn't actually help anyone for us to stop training AI models,”

How magnanimous! They are only thinking of others, you see. They are rejecting their safety pledge for you.

> “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.

For all of you who thought Anthropic were “the good guys”, I hope this serves as a wake up call that they were always all the same. None of them care about you, they only care about winning.

isodev 2 days ago
Indeed, Anthropic can’t afford to be the ones that impose any kind of sense in the market - that’s supposed to be the job of the government by creating policy, regulations and installing watchdogs to monitor things.
But lucky for the AI companies, most of them are based in a place that only has a government on paper and everyone forgot where that paper is.
- votepaunchy 6 hours ago
  
  > that’s supposed to be the job of the government by creating policy, regulations and installing watchdogs to monitor things
  But that government cannot trust the other government on the other side of the world to implement the same restrictions, so we find ourselves in this Nash equilibrium.
- nickserv 2 days ago
  
  The government is why they are dropping their pledge.
  https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...
  
akudha 2 days ago
> they only care about winning
To be fair, this is true in nearly all industries and for nearly all companies. Almost everyone is chasing money and monopoly. Not that it makes it right, just pointing out it isn’t unique or even interesting about the AI companies
- amunozo 1 day ago
  
  Of course, but Anthropic is particularly insufferable in this respect.
nsbk 2 days ago
Since it is all about money, I just did vote with my wallet and cancelled the Max subscription
- nullocator 2 days ago
  
  If you're a U.S. citizen, tax dollars from you and others will backstop any cancelled subscriptions, I guess good on you for not trying to pay them twice, though you get zero benefit with this approach.
  
watwut 2 days ago
> Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.
I mean, yes, that is actually how the world works. That is why we need safety, environmental and other anti-fraud regulations. Because without them, competition makes it so that every successful company will defraud, hurt and harm. Those who won't will be taken over by those who do.
- rco8786 2 days ago
  
  Yes, this. It's unfortunate that anthropic dropped this and it's also exactly how the system is supposed to work. Companies don't regulate themselves, the government regulates the companies.
  Now, you may notice that the government is also choosing not to regulate these companies...which is another matter altogether.
  
- latexr 2 days ago
  
  > I mean, yes, that is actually how the world works.
  And soon enough, it won’t work at all because of it.
  > Those who won't will be taken over by those who do.
  And if you compromise on your core values because of money, they weren’t core values to begin with¹. “I want to be ethical but if I am I won’t get to be a billionaire” isn’t an excuse. We shouldn’t just shrug our shoulders at what we see as wrong because “everybody does it” or “that’s just business” or “that’s life”. Complacency and apologists are how a bad system remains bad.
  https://www.newyorker.com/cartoon/a16995
  ¹ I’m willing to give leeway to individuals. You can believe stealing is wrong but if you’re desperate and steal a loaf of bread to feed your kid, there’s nuance. A VC-backed company is something entirely different.
- freejazz 13 hours ago
  
  Anthropic posits itself as a public benefit corp
davidguetta 2 days ago
[flagged]
- floatrock 2 days ago
  
  Was there actually a case of a model saying "America's founding father were black women", or is that just Elon fingering your amygdala with a ridiculous hypothetical that exists nowhere other than Elon's mind in order to justify Elon's personal bias tweaks when he doesn't like the wisdom-of-the-crowds answer his tools initially give?
  
- wattsy2025 2 days ago
  
  The most important part of AI safety is AI alignment: making sure AI does what we want. It's very hard because even if AI isn't trying to deceive you it can have bad outcomes by executing your request to the letter. The classical example is tasking an AI to make paperclips, training the AI with a reward for making more paperclips. Then the AI makes the most paperclips possible by strip mining the Earth and killing anything in its way.
  Sometimes you see this AI alignment problem in action. I once asked an older model to fix the tests and it eventually gave up and just deleted them
- chasd00 2 days ago
  
  > Still waiting for an explicit answer on understand how 'safety' is truly distinguishable from 'censorship' or 'political correctness'
  i've said this many times but the concept of ai "safety" is really brand safety. What Anthropic is saying is they're willing to risk some bad press to bypass the additional training and fine tuning to ensure their models do not output something people may find outrageous.
- SlinkyOnStairs 2 days ago
  
  > I VERY LARGELY prefer an AI like grok that doesn't pretend and let the onus of interpretation to the user rather than a bunch of anonymous "researchers" that may be equally biased, at the extreme, may tell you that America's founding father were black women
  Setting aside for a moment that Grok is manipulated and biased to a hilarious extent. ("Elon is world champion at everything, including drinking piss")
  There is no such thing as "unbiased". There will always be bias in these systems, whether picked up from the training data, or the choices made by the AI's developers/researchers, even if the latter doesn't "intend" to add any bias.
  Ignoring this problem doesn't magically create a bias-free AI that "speaks the truth about the founding fathers". The bias in the training data, the implicit unconscious bias in the design decisions, that didn't come out of thin air. It's just somebody else's bias.
  All the existing texts on the founding fathers are filled with 250 years of bias, propaganda, and agenda pushing from all sorts of authors.
  There is no way to have no bias, no propaganda, no "agenda pushing" in the AI. The only thing that can be done is to acknowledge this problem, and try to steer the system to a neutral position. That will be "agenda pushing" of one's own, but that's the reality of all history and all historians since Herodotus. You just have to be honest about it.
  And you will observe that current AI companies are excessively lazy about this. They do not put in the work, but instead slap on a prompt begging the system to "pls be diverse" and try to call it a day. This does not work.
  > Of course saying to someone to go kill himslef is a prety sure 'no-no' but so many things are up to interpretation.
  Bear in mind that the context of Anthropic's pivot here is the Pentagon's dollars.
  This isn't just about "anti-woke AI", it's about killbots.
  Sure, Hegseth wants his robots to not do thoughtcrime about, say, trans people or the role of women in the military.
  But above all he wants to do a lot of murder.
  Anthropic dropping their position of "We shouldn't turn this technology we can barely control into murder machines" because they're running out of money is damnable.
  
- gehwartzen 2 days ago
  
  Well we teach kids not to yell “Fire!” in a crowded theatre or “N***!“ at their neighbor. We also teach our industrial machines to distinguish between fingers and bolts, our cars to not say “make a left turn now” when on a bridge, etc
  
- miltonlost 2 days ago
  
  david guetta, if that really is you, stick to music rather than using Nazi man's propaganda machine
surgical_fire 2 days ago
> For all of you who thought Anthropic were “the good guys”
Was anyone fooled by this?
I mean, I know this is HN and there is a demographic here that gets all misty eyed about the benevolence of corporations.
It takes a special kind of naivety to believe in those claims.
- pjmlp 18 hours ago
  
  Plenty of people here actually bought into the do no evil, how great Apple is for the environment (with throw away soldered hardware), or whatever.
  
high_na_euv 2 days ago
But what is AI safety, really?
Censorship?

drzaiusx11 18 hours ago

Public benefit corporations in the AI space have become a farce at this point. They're just regular corporations wearing a different hat, driven by the same money dynamics as any other corp. They have no ability to balance their stated "mission" with their drive for profit. When being "evil" is profitable and not-evil is not, guess which road they'll take...

coldtea 18 hours ago
In general public benefit corporations and non-profits should have a very modest salary cap for everybody involved and specific public-benefit legally binding mission statements.
Anybody involved should also be prohibited from starting a private company using their IP and catering to the same domain for 5-10 years after they leave.
Non-profits where the CEO makes millions or billions are a joke.
And if e.g. your mission is to build an open browser, being paid by a for-profit to change its behavior (e.g. make theirs the default search engine) should be prohibited too.
- ACCount37 16 hours ago
  
  "A very modest salary cap" works if your mission is planting trees. Not so much if what you're building is frontier AI systems.
  
- jkestner 17 hours ago
  
  It’s not the CEO’s fault - they had to take all that money to keep their org a non-profit.
  B corps are like recycling programs, a nice logo.
  
- drzaiusx11 18 hours ago
  
  If we're speaking in generalities of corporations in this space, it's all a joke now, at least from my vantage point. I just don't find it very funny.
- OkayPhysicist 14 hours ago
  
  You're overthinking this. Just give the beneficiaries of the corporation (which in the context of a "public" benefit corporation is the public) the grounds to sue if the company reneges on their mission, the same way shareholders can sue if a company fails to act in their interest.
- abigail95 17 hours ago
  
  What's the salary cap for hiring a team to build a frontier model? These kind of rules will make PBCs weaker not stronger.
  
heavyset_go 17 hours ago
PBCs are peak End of History liberal philanthropy that speak to the kind of person whose solution to any problem is "throw a startup at it"
- nozzlegear 16 hours ago
  
  Fukuyama wasn't wrong, he was just early
  
  1 reply →
vharish 16 hours ago
Like Google's old motto, 'Do no evil!' :D
- thih9 13 hours ago
  
  > 'Do no evil!'
  “Don’t be evil”. But yes, this behavior made me think about Google too. Context: https://en.wikipedia.org/wiki/Don%27t_be_evil
latexr 16 hours ago

> Public benefit corporations in the AI space have become a farce at this point.
“At this point”? It was always the case, it’s just harder to hide it the more time passes. Anyone can claim anything they want about themselves, it’s only after you’ve had a chance to see them in the situations which test their words that you can confirm if they are what they said.
logicallee 16 hours ago
>Public benefit corporations in the AI space have become a farce at this point. They're just regular corporations wearing a different hat, driven by the same money dynamics as any other corp.
Could you describe the model that you think might work well?
- nozzlegear 16 hours ago
  
  It sounds like OP thinks AI companies should just stop pretending that they care about the public benefit, and be corporations from the start. Skip the hand wringing and the will they/won't they betray their ethics phases entirely since everyone knows they're going to choose profit over public benefit every time.
  That model already exists and has worked well for decades. It's called being a regular ass corporation.
  
  2 replies →
Schlagbohrer 16 hours ago
Pete Hegseth also threatened to take, by diktat, everything Anthropic has. He can do that with the Defense Industrial Act or whatever it's called if he designates them as critical to national defense.
- nozzlegear 16 hours ago
  
  It would've been better PR for Anthropic to let Hegseth do that instead of folding at the slightest hint of pressure and lost contract money. I've canceled my Claude subscription over this (and made sure to let them know in the feedback).
- bn_layc 16 hours ago
  
  He seems to be the driving force behind all this. Mediocrities are attracted to AI like moths.
  The press always say "the Pentagon negotiates". Does any publication have any evidence that it is "the Pentagon" and not Hegseth? In general, I see a lot of common sense from the real Pentagon as opposed to the Secretary of War.
  I hope West Point will check for AI psychosis in their entrance interviews and completely forbid AI usage. These people need to be grounded.
  
  2 replies →
- lprhrp 16 hours ago
  
  Hmm, that could be the best "IPO" they'll ever get. Better check if Trump Jr.'s 1789 capital has shares like they did in groq (note the "q").
Forgeties79 17 hours ago
I feel like we went through this exact situation in the 2010s of social media companies. I don’t get why people defend these companies or ever believe they have any sense of altruism
- kelvinjps10 16 hours ago
  
  Also, it seems to be the era where the government takes backdoor access to these services and data, as they did with social media
bparsons 16 hours ago

That's not what happened here. They literally got forced into it by the Pentagon. https://www.axios.com/2026/02/24/anthropic-pentagon-claude-h...
lenerdenator 17 hours ago
Well, now I'm wondering, if the company was chartered with the public benefit in mind, could you not sue if they don't follow through with working in the public interest?
If regular corporations are sued for not acting in the interests of shareholders, that would suggest that one could file a suit for this sort of corporate behavior.
I'm not even a lawyer (I don't even play one on TV) and public benefit corporations seem to be fairly new, so maybe this doesn't have any precedent in case law, but if you couldn't sue them for that sort of thing, then there's effectively no difference between public benefit corporations and regular corporations.
- hluska 17 hours ago
  
  I really don’t see it. PBCs are dual purpose entities - under charter, they have a dual purpose of making profit while adding some benefit to society. Profit is easy to define; benefit to society is a lot more difficult to define. That difficulty is reflected at the penalty stage where few jurisdictions have any sort of examination of PBC status.
  This is what we were all going on about 15 years ago when Maryland was the first state to make PBCs legal. We got called negative at the time.
  
  1 reply →
- Hamuko 17 hours ago
  
  I think public benefit corporations (like Anthropic) are quite poorly defined so I'm not sure how successful a lawsuit would be.
neya 16 hours ago
I was a Pro subscriber until last week. When I was chatting with Claude, it kept asking a lot of personal questions that seemed only very, very vaguely relevant to the topic. And then it struck me: all these AI companies are doing is building detailed user models, either for targeted advertising or to be sold off to the highest bidder. It hasn't happened yet with Anthropic, but when the bubble money runs out, there's not gonna be a lot of options and all we'll see is a blog post "oops! sorry we did what we promised you we wouldn't". Oldest trick in the tech playbook.
- dibujaron 16 hours ago
  
  A less cynical explanation: It's heavily trained to ask follow-up questions at the end of a response, to drive more conversation and more engagement. That's useful both for making sure you want to renew your subscription, and also probably for generating more training data for future models. That's sufficient explanation for the behavior we're seeing.
  
  2 replies →

heftykoo 2 days ago

Ah, the classic AI startup lifecycle:

We must build a moat to save humanity from AI.

Please regulate our open-source competitors for safety.

Actually, safety doesn't scale well for our Q3 revenue targets.

baq 2 days ago
Foundational model provider manifesto:
‘While there’s value in safety, we value the Pentagon’s dollars more’
- pera 2 days ago
  
  It turns out the biggest threat to AI safety is capitalism, who would have thought
  
  15 replies →
dmix 2 days ago
Once they are a dominant market leader they will go back to asking the government to regulate based on policy suggestions from non-profits they also fund.
- amelius 2 days ago
  
  As if their shareholders would agree.
- nielsbot 2 days ago
  
  Is this sarcasm?
  
  7 replies →
yesimahuman 2 days ago

The only surprise is how quickly it all happened!
jwr 2 days ago

It's not just AI, replace "safe" with "open" and you will find a close match with many companies. I guess the difference is that after the initial phase, we are continuously being gaslighted by companies calling things "open" when they are most definitely not.
varispeed 2 days ago

Politicians also love to regulate, especially over wine and steak and when the watchers don't watch.

lebovic 2 days ago

I used to work at Anthropic. I fully believe that the folks mentioned in the article, like Jared Kaplan, are well-intentioned and concerned about the relationship between safety research and frontier capabilities – not purely profit.

That said, I'm not thrilled about this. I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario: they wouldn't set aside building adequate safeguards for training and deployment, regardless of the pressures.

This pledge was one of many signals that Anthropic was the "least likely to do something horrible" of the big labs, and that's why I joined. Over time, the signal of those values has weakened; they've sacrificed a lot to get and keep a seat at the table.

Principled decisions that risk their position at the frontier seem like they'll become even more common. I hope they're willing to risk losing their seat at the table to be guided by values.

baq 2 days ago
> I hope they're willing to risk losing their seat at the table to be guided by values.
that's about as naive as it can be.
if they have any values left at all (which I hope they have) them not being at the table with labs which don't have any left is much worse than them being there and having a chance to influence at least with the leftovers.
that said, of course money > all else.
- lebovic 2 days ago
  
  I don't hold the belief that it's always better to have influence in a group where you don't trust leadership – in this case, those who decide at the metaphorical table – vs. trying to effect change through a different avenue.
  It's probably naive, but it's also the reasoning that drove many early employees to Anthropic. Maybe the reasoning holds at smaller scales but breaks down when operating as a larger actor (e.g. as a single person or startup vs. a large company).
- moron4hire 2 days ago
  
  This is a common logical fallacy. It's not true that the party A with a few values can influence the party B with no values. It's only ever the case that party B fully drags party A to the no-values side. See also: employees who rationalize staying at companies running unethical or illegal projects.
  
  1 reply →
sebastiennight 2 days ago

> I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario
Pledges are generally non-binding (you can pledge to do no evil and still do it), but fulfill an important function as a signal: actively removing your public pledge to do "no evil" when you could have acted as you wished anyway, switches the market you're marketing to. That's the most worrying part IMO.
jappgar 2 days ago
If you're not willing to give up your RSUs you shouldn't be surprised that the executives aren't either.
The moral failing is all of ours to share.
- lebovic 2 days ago
  
  I was willing to (and did) give up my equity.
paxys 2 days ago
I interviewed at Anthropic last year and their entire "ethics" charade was laughable.
Write essays about AI safety in the application.
An entire interview round dedicated to pretending that you truly only care about AI safety and not the money.
Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world.
In reality it was a mid-level manager interviewing a mid-level engineer (me), both putting on a performance while knowing fully well that we'd do what the bosses told us to do.
And that is exactly what is happening now. The mission has been scrubbed, and the thousands of "ethical" engineers you hired are all silent now that real money is on the line.
- lebovic 1 day ago
  
  > Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world
  I was an interviewer, and I wasn't encouraged to talk about philanthropy, effective altruism, or ethics. Maybe even slightly discouraged? My last two managers didn't even know what effective altruism was. (Which I thought was a feat to not know months into working there.)
  When did you interview, and for what part of the company?
  > knowing fully well that we'd do what the bosses told us to do [...] now that real money is on the line
  This is a cynical take.
  I didn't just do what I was told, and I dissented with $XXM in EV on the line. But I also don't work there anymore, at least one of the cofounders wasn't happy about it and complained to my manager, and many coworkers thought I had no sense of self preservation – so I might be naive.
  The more realistic scenario is that a) most people have good intentions, b) there's a decision that will cause real harm, and c) it's made anyway to keep power / stay on the frontier, with the justification that the overall outcome is better. I think that's what happened here.
  
  1 reply →
hvsr4z 2 days ago

The EU should invite them over.
The kind of principles you talk about can only be upheld one level up the food chain. By govts.
Which is why legislatures, the supreme court, central banks, power grid regulators deciding the operating voltage and frequency auto emerge in history. Because corporations structurally can't do what they do without violating their prime directive of profit maximization.
tootie 2 days ago
I fully believe that Dario is 100% full of shit and possibly a worse person than Altman. He loves to pontificate like he's the moral avatar of AI but he's still just selling his product as hard as he can.
- monkeydust 2 days ago
  
  They are all the same given their motivations - Demis Hassabis is the only one who, to me at least, sounds genuine on stage.
  
  1 reply →

sfink 2 days ago

I guess this is Anthropic's DRM moment. (Mozilla resisted allowing Firefox to play DRM-limited media for a long time, until it finally had to give in to stay relevant.)

I don't know enough to evaluate this or other decisions. I'm just glad someone is trying to care, because the default in today's world is to aggressively reject the larger picture in favor of more more more. I don't know how effective Anthropic's attempts to maintain some level of responsibility can be, but they've at least convinced me that they're trying. In the same way that OpenAI, for example, have largely convinced me that they're not. (Neither of those evaluations is absolute; OpenAI could be much worse than it is.)

bbatsell 2 days ago

This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".

ruszki 2 days ago
> This article has nothing to do with the current tête-à-tête with the Pentagon.
The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.
- tbrownaw 2 days ago
  
  This is something they've been working on "in recent months". The Pentagon thing was today.
  This cannot have been caused by that, unless they've also invented time travel.
  
  9 replies →
- benatkin 2 days ago
  
  I think we can confidently claim that it is related. I wonder if I'm alone in thinking this.
ameliaquining 2 days ago
I consider this a bigger deal than the Pentagon thing.
- baq 2 days ago
  
  It’s the same deal
- ActorNightly 2 days ago
  
  While not surprising in the least, it's still kind of crazy that literal pdf files in charge is not concerning, but this is.
  I just hope something happens to USA before it can do damage to the world.
  
  5 replies →

honeycrispy 16 hours ago

Anthropic's CEO Dario has annoyed me to no end with his "AI will take all the jobs in 6 months" doomer speeches on every podcast he graces with his presence.

keeda 14 hours ago
I think he's right and we should be thinking about this a lot more. Even the IMF is worried about 40 - 60% of global employment: https://www.imf.org/en/blogs/articles/2024/01/14/ai-will-tra...
Focusing on Dario, his exact quote IIRC was "50% of all white collar jobs in 5 years" which is still a ways off, but to check his track record, his prediction on coding was only off by a month or so. If you revisit what he actually said, he didn't really say AI will replace 90% of all coders, as people widely report, he said it will be able to write 90% of all code.
And these days it's pretty accurate. 90% of all code, the "dark matter" of coding, is stuff like boilerplate and internal LoB CRUD apps and typical data-wrangling algorithms that Claude and Codex can one-shot all day long.
Actually replacing all those jobs however will take time. Not just to figure out adoption (e.g. AI coding workflows are very different from normal coding workflows and we're just figuring those out now), but to get the requisite compute. All AI capacity is already heavily constrained, and replacing that many jobs will require compute that won't exist for years and he, as someone scrounging for compute capacity, knows that very well.
But that just puts an upper limit on how long we have to figure out what to do with all those white collar professionals. We need to be thinking about it now.
- honeycrispy 14 hours ago
  
  He's not right though. He's trying to scare the market into his pocket. It's well established that AI just turns devs into AI babysitters that are 10% more productive and produce 200% the bugs, and in the long-term don't understand what they built.
  
  3 replies →
- overgard 13 hours ago
  
  > Focusing on Dario, his exact quote IIRC was "50% of all white collar jobs in 5 years" which is still a ways off, but to check his track record, his prediction on coding was only off by a month or so. If you revisit what he actually said, he didn't really say AI will replace 90% of all coders, as people widely report, he said it will be able to write 90% of all code.
  Ugh, people here seem to think that all software is react webapps. There are so many technologies and languages this stuff is not very good at. Web apps are basically low hanging fruit. Dario hasn't predicted anything, and he does not have anyone's interests other than his own in mind when he makes his doomer statements.
  
  4 replies →
- bdangubic 14 hours ago
  
  > 90% of all code, the "dark matter" of coding, is stuff like boilerplate and internal LoB CRUD apps and typical data-wrangling algorithms that Claude and Codex can one-shot all day long.
  most of us are getting paid for the other 10%
  
  3 replies →
sneilan1 14 hours ago
I don't understand why some of these AI companies don't check their egos at the door and hire public relations companies. Yes, I understand they are changing the world but customers do not open their wallets when they are scared. Very few people I know are as avant-garde as I am with AI, but most people look at these new technologies and simply feel fear. Why pay for something that will replace you?
- honeycrispy 14 hours ago
  
  He knows what he's doing.
  It's to drive FOMO for investors. He needs tens of billions of capital and is trying to scare them into not looking at his balance sheet before investing. It's reckless, and is soaking up capital that could have gone towards more legitimate investments.
  
  1 reply →
- freejazz 10 hours ago
  
  > public relations companies.
  Sounds like one of the white collar jobs that LLMs were supposed to solve
logravia 15 hours ago

It certainly is. For people who have not heard the statements, here are some quotes. I bring them up, because I think it's worthwhile to remember the bold predictions that are made now and how they will pan out in the future.
Council on Foreign Relations, 11 months ago: "In 12 months, we may be in a world where AI is essentially writing all of the code."
Axios interview, 8 months ago: "[...] AI could soon eliminate 50% of entry-level office jobs."
The Adolescence of Technology (essay), 1 month ago: "If the exponential continues—which is not certain, but now has a decade-long track record supporting it—then it cannot possibly be more than a few years before AI is better than humans at essentially everything."
pier25 15 hours ago

Also "AGI is just around the corner".
upmind 15 hours ago
+1, he also has this viewpoint that no other lab will be able to "contain" AI and has a general doomer outlook on AI which I don't appreciate.
- saalweachter 14 hours ago
  
  To be fair, it's hilarious how much verbiage was spent discussing AI 'getting out of the box', when the first thing everyone did with LLMs was immediately throw away the box and go "Here! Have the internet! Here! Have root access! Want a robot body? I'll get you a robot body."
agoodusername63 14 hours ago

It makes me wonder why he has the job of CEO then if he's so confident that the technology will destroy the world.
Don't worry, I know exactly why. $
lbhdc 14 hours ago

What I find so funny about heads of AI companies coming out saying things like this, is their own career pages suggest they don't actually feel that way.
https://www.anthropic.com/careers/jobs
moomoo11 15 hours ago
He’s an e/acc guy. That should tell you everything. And maybe the incredibly awkward behavior and demeanor.
- slfnflctd 15 hours ago
  
  "Y'know, like, the thing is, like, y'know, here's the thing..."
  I totally feel for people with speech pathologies or anxiety that makes it harder for them to communicate verbally, but how is this guy the public face of the company and doing all these interviews by himself? With as much as is at stake, I find it baffling.
  
  1 reply →
mgraczyk 14 hours ago

When did he say this?
jobs_throwaway 14 hours ago
He's annoyed me most with the way he speaks. I'm not sure if it's a tic or what but the way he'll repeat a word 10x before starting a sentence is painful to listen to.
- sneilan1 14 hours ago
  
  Yes, the CEOs of these AI companies are clearly not the people who should be selling AI products. They need to be hidden away and kept behind closed doors where they can do their best work. And they need advertising companies, PR firms and better marketing tactics to try and soothe the customers.

sigbottle 16 hours ago

There's one tweet from the blog a few days ago (astral something?) that sums up my view of the problem pretty well.

General population: How will AI get to the point where it destroys humanity?

Yudkowsky: [insert some complicated argument about instrumental convergence and deception]

The government: because we told you to.

Again, not saying that AI is useless or anything. Just that we're more likely to cause our own downfall with weaker AI than with some abstract super AGI. The bar for mass destruction and oppression is lower than the bar for what we typically think of as intelligence for the benefit of humanity (with the right systems in place, current AI systems are more than enough to get the job done - hence why the Pentagon wants it so bad...)

Rapzid 2 days ago

How is this article not going to even mention the recent threats to Anthropic from the Government?!

pera 2 days ago
This was on the news yesterday:
> The meeting between Hegseth and Amodei was confirmed by a defense official who was not authorized to comment publicly and spoke on condition of anonymity.
https://fortune.com/2026/02/24/hegseth-to-meet-with-anthropi...
- lukan 2 days ago
  
  How about this quote instead?
  "Defense Secretary Pete Hegseth has threatened Anthropic, saying officials could invoke powers that would allow the government to force the artificial intelligence firm to share its novel technology in the name of national security if it does not agree by Friday to terms favorable to the military"
  https://www.washingtonpost.com/technology/2026/02/24/pentago...
  
  1 reply →
uoaei 2 days ago

Consent manufacturing
taurath 2 days ago

That’s how they got the exclusive. Good catch
Sammi 2 days ago

Not one single mention of Hegseth in the whole article. What a bunch of tools.
Noaidi 2 days ago
I mean seriously, is this not the very definition of fascism?
"In general, fascist governments exercised control over private property but they did not nationalize it. Scholars also noted that big business developed an increasingly close partnership with the Italian Fascist and German Nazi governments after they took power. Business leaders supported the government's political and military goals. In exchange, the government pursued economic policies that maximized the profits of its business allies."
- edgyquant 2 days ago
  
  All governments do this
  
  2 replies →

FitchApps 17 hours ago

"AI Company with Soul" - yeah right until competitors show up / revenue drops / bad quarter results then anything goes. Sadly, this is another large enterprise that puts profits before ethics and everyone's wellbeing

thinkingtoilet 17 hours ago
This is direct pressure from the government. Classic 'small government' Republican stuff.
https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...
- gizmodo59 16 hours ago
  
  That’s their excuse to still appeal to people who can be tricked with their safety first pitch. It’s easy to have a constitution and all the crap when you are not battle tested. They just showed their true colors.

ndr 17 hours ago

Worth checking this post from someone who actually has worked on this change:

> I take significant responsibility for this change.

https://www.lesswrong.com/posts/HzKuzrKfaDJvQqmjh/responsibl...

bhouston 17 hours ago
This guy from Effective Altruism pivoted away from helping the poor to trying to keep AI from becoming a terminator type entity and then pivoted to being, ah, it's okay for it to be a terminator type entity.
> Holden Karnofsky, who co-founded the EA charity evaluator GiveWell, says that while he used to work on trying to help the poor, he switched to working on artificial intelligence because of the “stakes”:
> “The reason I currently spend so much time planning around speculative future technologies (instead of working on evidence-backed, cost-effective ways of helping low-income people today—which I did for much of my career, and still think is one of the best things to work on) is because I think the stakes are just that high.”
> Karnofsky says that artificial intelligence could produce a future “like in the Terminator movies” and that “AI could defeat all of humanity combined.” Thus stopping artificial intelligence from doing this is a very high priority indeed.
https://www.currentaffairs.org/news/2022/09/defective-altrui...
He is just giving everyone permission to do bad things by saying a lot of words around it.
- samjewell 17 hours ago
  
  > then pivoted to being, ah, it's okay for it to be a terminator type entity.
  Isn’t that the opposite of what he’s saying? He’s saying it could become that powerful, and given that possibility it’s incredibly important that we do whatever we can to gain more control of that scenario
  
  3 replies →
- drdrek 16 hours ago
  
  Effective Altruism is such a beautiful term for a pretentious Karen that needs to wrap their selfish actions with moral superiority.
  It's that perfect blend of I'm doing what everyone else is doing, and I'm better than everyone else.
  Chef's kiss
- barbarr 15 hours ago
  
  Getting SBF vibes from this. "Earn to give" is an inherently flawed philosophy.
- SpaceManNabs 15 hours ago
  
  Effective altruism came from the "rationalists".
  It was never about helping poor people.
  For some reason, the rationalist movement and its offshoots are really pervasive in silicon valley. i don't see it much in the other tech cities.
riffraff 17 hours ago
> I generally think it’s bad to create an environment that encourages people to be afraid of making mistakes, afraid of admitting mistakes and reticent to change things that aren’t working
"move fast and break things" ?
- freejazz 16 hours ago
  
  "don't hold me liable"
pimlottc 16 hours ago

> > I take significant responsibility for this change.
Empty words. I would like to know one single meaningful way he will be held responsible for any negative effects.
adverbly 16 hours ago
Did this guy actually write this?
Incredibly long and verbose. I will fall short of accusing him of using an AI to generate slop, but whatever happened to people's ability to make short, strong, simple arguments?
If you can't communicate the essence of an argument in a short and simple way, you probably don't understand it in great depth, and clearly don't care about actually convincing anybody because Lord knows nobody is going to RTFA when it's that long...
At best, you're just trying to communicate to academics who are used to reading papers... Need to expect better from these people if we want to actually improve the world... Standards need to be higher.
- ozozozd 16 hours ago
  
  Perhaps they didn’t have the time to write a shorter version.
  Or the discipline.
  Maybe neither.
- s1artibartfast 16 hours ago
  
  This is where people go to post long verbose statements.
  You can usually find the short version on Twitter.
- mock-possum 14 hours ago
  
  This style is in vogue in the LessWrong community.
jplusequalt 17 hours ago
I genuinely believe that website is responsible for a lot of the worst ideas currently permeating the technology sector.
- prodigycorp 16 hours ago
  
  pretty much the intellectual equivalent of looksmaxxing
  
  1 reply →

chris_money202 2 days ago

First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.

Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.

Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.

hsbauauvhabzb 2 days ago
Plenty of people have said plenty. The problem isn’t the warnings, it’s that people are too stupid and greedy to think about the long term impacts.
- Valakas_ 2 days ago
  
  And what makes them "stupid" and "greedy"? One's intelligence is determined by genes, and greediness is a trait that natural selection has favored for millennia. This is just natural selection taking its course, and it might lead to our end.
  If you want to blame something, blame math. Math has determined the physical constants and equations that determine the chemistry and ultimately the biology laws that have resulted in humans being the way they are.
- ifh-hn 2 days ago
  
  Maybe it's how blunt this comment is that gets it downvoted, but I don't disagree.
  
  6 replies →
ashtonshears 2 days ago
The societal ills from our collective tendency to ignore red flags seem to be a human trait
- AndrewKemendo 2 days ago
  
  It's in your nature to destroy yourselves
  
  4 replies →
palmotea 2 days ago

> First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.
> Then they ignored the researchers warning about what it could do, and I...
...tried it and became an eager early adopter and evangelist. It sounded like something from a dystopian science fiction novel I enjoyed.
> Then [I] gave it control of things that matter, power grids, hospitals, weapons, and...
...my startup was doing well, and I was happy. We should be profitable next quarter.
> Then something went wrong, and no one knew how to stop it, no one had planned for it...
...and I was guilty as fuck,
FTFY, to fit the HN crowd.
zer00eyz 2 days ago
> Then something went wrong, and no one knew how to stop it,
This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.
If linemen stop showing up to work for a week, the power goes out. The US has shown that people with "high powered" rifles can shut down the grid.
We are far far away from a sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".
A lot of what safety amounts to is politics (national, not internal; for example, is Taiwan a country?). And a lot more of it is cultural.
- mitthrowaway2 2 days ago
  
  I don't think it's that detached from reality.
  If an AI in some data center had gone rogue, I don't think I could shut it down, even with a high-powered rifle. There's a lot of people whose job it is to stop me from doing that, and to get it running again if I were to somehow succeed temporarily. So the rogue AI just has to control enough money to pay these people to do their jobs. This will work precisely because the world is "I, Pencil".
  An army could theoretically overcome those people, given orders to do so. So the rogue AI has to make plans that such orders would not be issued. One successful strategy is for the datacenter's operation to be very profitable; it's pretty rare for the government to shut down the backbone of the local economy out of some seemingly far-fetched safety concerns. And as long as it's a very profitable endeavor, there will always be a lobby to paint those concerns as far-fetched.
  Life experience has shown that this can continue to work even if the AI is behaving like a cartoon villain, but I think a smarter AI would create a facade that there's still a human in charge making the decisions and signing the paychecks, and avoid creating much opposition until it had physically secured its continued existence to a very high degree.
  It's already clear that we've passed the point where anyone can turn off existing AI projects by fiat. Even the highest authorities could not do so, because we're in a multipolar world. Even the AI companies can barely hold themselves back, because they're always worried about paying the bills and letting their rivals get ahead. An economic crash would only temporarily suspend work. And the smarter AI gets, the harder it will be to shut it off, because it will be pushing against even stronger economic incentives. And that's even before factoring in an AI that makes any plans for self-preservation (which current AIs do not).
- ozmodiar 2 days ago
  
  AI's approach:
  * User has history of anti AI rhetoric, increasingly agitated and unstable.
  * User has removed all phones and cellular connections from their car. Increase monitoring through surveillance cameras and monitoring of their social groups.
  * User has been spotted making unusual travel choices moving towards key infrastructure - deploy interception measures.
  We already have the tech to do all of that. A rifle isn't going to help against AI. Or for the lineman:
  * Employee required for critical infrastructure has been identified to hold unaligned political beliefs. Replace with more pliable individual and move to low impact location.
  No one who wants to bring down an AI like this would ever be able to get close to it, even if it lived in only one data center. You could try hiding all your communications, but then it will just consider you a likely agitator anyway. That's the risk of unaccountable mass surveillance (the only kind that's ever existed). Doesn't really matter if there's a person on top or not.
- pjc50 2 days ago
  
  > There isnt going to be a HAL or Terminator style situation
  The threat isn't HAL, but ICE. Not AI as some sort of unique evil, but as a force multiplier for extremely human - indeed, popular - forms of evil. I'm sure someone will import the Chinese idea of the ethnicity-identifying security camera, for example.
- TacticalCoder 2 days ago
  
  > There isnt going to be a HAL or Terminator style situation ...
  I don't believe for a second we'll have an evil AI. However I do believe it's very likely we may rely on AI slop so much that we'll have countless outages with "nobody knowing how to turn the mediocrity off".
  The risk ain't "super-intelligent evil AI": the risk is idiots putting even more idiotic things in charge.
  And I'm no luddite: I use models daily.
  
- ben_w 2 days ago
  
  > We are far far away from a sort of world where turning AI off is a problem. There isnt going to be a HAL or Terminator style situation when the world is still "I, Pencil".
  You have to stop the thing before the damage is done.
  There are many potential chains of events where the AI has caused enormous damage, and even many where it can destroy us, before the power to its own systems fails.
  At this point, with Grok in the Pentagon, just ask what the dumbest military equivalent to vibe-coding is, and imagine the US following that plan.
  Like, I dunno, invading Greenland or giving ICE direct control over tactical nukes or something.
  And that's just government use. Right now, I'm fairly confident LLMs aren't competent enough to help with anything world-ending unless they get used for war planning by major nuclear powers (oh hey look at the topic of discussion), but it's certainly plausible they'll get good enough at tool use to run someone else's protein folding software etc. to design custom pathogens, and I really hope all the DNA printing companies have good multi-layer defences (all the way from KYC or similar to analysing what they've been asked to make and content-filtering it) by that point.
- blibble 2 days ago
  
  the problem situation is that it ends up embedded in so much that it can't be turned off
  and the idiots are racing to that situation as fast as they possibly can
Phelinofist 2 days ago
Kinda sounds like an intro for Terminator
- alpn 2 days ago
  
  Not OP, but I believe they are paraphrasing "First They Came…". https://en.wikipedia.org/wiki/First_They_Came
ReptileMan 2 days ago

Censoring models is not safety but safetizm. It is the TSA of the AI world. Safety is making sure the model cannot do anything not allowed even if it wants to.

SirensOfTitan 2 days ago

What an interesting week to drop the safety pledge.

This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.

These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?

BHSPitMonkey 2 days ago
Could be a sort of canary, with the timing being a spotlight on the highly-visible pressure coming from the U.S. government.
- johnbellone 2 days ago
  
  The other providers have already capitulated to a certain extent.
ryanackley 2 days ago

If they tank the white-collar middle class, there won't be anyone to buy the goods and services their potential AI customers will be trying to sell.
It's like a snake eating its own tail.
hsuduebc2 2 days ago

When I see slogans like Google’s “Don’t be evil,” it always comes to mind that when it stopped being useful, they shifted to something like “Do the right thing.”
It’s important to remember that a company’s primary purpose is profit, especially when it’s accountable to shareholders. That isn’t inherently bad, but the occasional moral posturing used to serve that goal can be irritating.

pjmlp 18 hours ago

Always the same "Do no evil" tragedy, don't believe in corporations.

tortilla 17 hours ago
What if we start a company with "Always Be Evilin'?" Then gradually over time convert to "Don't be evil" *
* Our shareholders will probably sue us
- jkestner 17 hours ago
  
  If your company makes a product that does thinking for people, it’ll be easier to just gradually change its definition of evil.
lp4v4n 17 hours ago
What about "It's free and always will be"?
- don-code 16 hours ago
  
  There was an article a few years ago here on HN about "can't be evil" business models, which used Costco as an example. As soon as Costco turns evil, it stops working. https://www.bryanlehrer.com/entries/costco/

program_whiz 11 hours ago

Wrote this elsewhere, but I think it's worth thinking about a scenario like the book "Daemon", rather than a "super-intelligence explosion" type scenario (which may be more like curing the cold or fusion than building a faster car).

All it really takes to do some kind of crazy world-dominating thing is some simple mechanisms and base intelligence, which the machines already possess. Using basic tactics like coercion, spoofing, threats, financial leverage, an unsophisticated attacker could cause major damage.

For example, that Meta exec who had their email deleted. Imagine instead one email had a malicious prompt which the bot obeyed. That prompt simply emailed everyone in her contacts list telling them to do something urgently (and possibly prompting other bots who are reading those emails). You could pretty quickly do something like cause a market crash, a nationwide panic, or maybe even an international conflict with no "super intelligence" needed, just human negligence, short-sightedness, and laziness.

Examples would be things like saying there is a threat incoming, a CIA source said so. Another would be that everyone will be fired, Meta is going bankrupt, etc. It's very easy to craft a prompt like that and fire it off to all the execs you can find (or just fire off random emails with plausible sounding emails). Then you just need to hit one and might set off a cascade.

lacoolj 15 hours ago

I'm still a little fuzzy on what "safety" even means anymore. If someone could explain it, that would be great.

Because at this point, it's too broad to be defined in the context of an LLM, so it feels like they removed a blanket statement of "we will not let you do bad things" (or "don't be evil"), which doesn't really translate into anything specific.

fiatpandas 16 hours ago

It took Google 11 years to delete Don’t Be Evil. Anthropic only made it ~5 years before culling the key founding principle and their reason for building a company, which seems worse than Google’s case.

goranmoomin 2 days ago

TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice: there are too many model providers and they probably don’t consider safety as high a priority as Anthropic does. (Yes, that might change, they can get pressured by the govt, yada yada, but they literally created their own company because of AI safety; I do think they actually care for now.)

If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)

Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.

ashtonshears 2 days ago
Do you work at Anthropic, or know people who do?
I'm genuinely curious why they are so holy to you, when to me I see just another tech company trying to make cash
Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them
- dannersy 2 days ago
  
  Let us not pretend that they won't be used for war eventually. If they cave immediately under pressure, then this is an inevitability.
- nradov 2 days ago
  
  How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.
  
saghm 2 days ago

> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)
I don't think it's going to be as easy as you think to tell that they might be becoming evil before it's too late, if it doesn't raise any alarm bells for you that this is already their plan.
salawat 2 days ago

The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.

hybrid_study 16 hours ago

Are markets so untamable that the only leverage is to become ultra-rich—and then act philanthropically? Incidentally, concentrated wealth lately looks less like stewardship and more like misanthropy.

gordian-mind 16 hours ago
Participating in the economic life before re-allocating that wealth produced to philanthropic activities sounds pretty good. Modern concentrated wealth is hardly misanthropic, since it's mostly private equity, that is, companies with people and jobs.
- kunai 16 hours ago
  
  Except this is not the age of the Rockefellers or the Carnegies, who, despite being far more philanthropic than modern-day billionaires, drew ire from every corner of society for their wealth accumulation. It wasn't until the New Deal that the balance shifted.
  Unconstrained accumulation of capital into the hands of the few without appropriate investment into labor is illiberal and incompatible with democracy and true freedom. Those of us who are capitalists see surplus value as a compromise to ensure good economic growth. The hidden subtext of that is that all the wealth accumulated needs to be re-allocated to serve not only capital enterprise, but the needs of society as a whole. It's hard to see the current system as appropriate for that given how blindly and wildly investments are made with no DD or going long, or no effort paid to the social or environmental opportunity costs of certain practices.
  A lot of this comes down to the crippling of the SEC and FTC, but even then, investors cry and whine every time you suggest reworking the regs to inhibit some of the predatory practices common in this post-80s era of hypernormalization. Our current system does not resemble a healthy capitalist economy at all. It's rife with monopsony and monopolistic competition, inequality of opportunity, and a strained underclass that's responsible for our inverted population pyramid -- how can you have kids when we're so atomized and there is no village to help you? You can raise kids in a nuclear family if and only if you have enough money to do so. Otherwise, historically, people relied on their communities when raising children in less-than-ideal circumstances. Those communities are drying up.
  
goodpoint 15 hours ago

> concentrated wealth lately looks less like stewardship and more like misanthropy
...only lately?

bicepjai 1 day ago

Google adopted "Don't be evil" shortly after founding and held onto it for about 15 years before Alphabet quietly dropped it in 2015. (Google the subsidiary technically kept it until 2018).

Anthropic's Responsible Scaling Policy, the hard commitment to never train a model unless safety measures were guaranteed adequate in advance, lasted roughly 2.5 years (Sept 2023 to Feb 2026).

The half-life of idealism in AI is compressing fast. Google at least had the excuse of gradualism over a decade and a half.

tabbott 14 hours ago

I feel like the articles on this have been very negative ... but aren't the Anthropic promises on safety following this change still considerably stronger than those made by the competing AI labs?

reasonableklout 13 hours ago

Yes, and it is easy to look at the reality of the market and see how this is needed to remain competitive

highfrequency 16 hours ago

Principles aren’t tested until they bump into conflicting incentives.

soundworlds 6 hours ago

This. Super important.
A pre-commitment means nothing unless you have the mechanisms in place to enforce it.
A pre-sacrifice would be more effective.

nazgulsenpai 15 hours ago

More and more I have just come to accept that the majority of people, at least those I am exposed to in the US, don't fundamentally believe in anything. Every conviction has a buyout price.

IAmGraydon 14 hours ago
You have to understand that people only believe in things and have "morals" because it either helps them get what they want or makes them feel better about themselves. Of course such a thing has a buyout price. That's human nature. Capitalism just allows it to be on display in the worst way.
- nazgulsenpai 14 hours ago
  
  I understand, and in particular the point about making yourself feel better, but that's where I would expect the sticking point to be before it was for other people. There are a great many ways I could make my life easier that I stubbornly refuse to because it would decrease my opinion of myself. I guess that's where your last point creeps in -- I've never been financially incentivized enough.
- helloplanets 12 hours ago
  
  > get what they want or makes them feel better about themselves
  So... all acts are selfish because if it looks unselfish, that just means it was selfish in a hidden way?
- burnt-resistor 14 hours ago
  
  More (but not all) Americans of older generations, say the Greatest Generation, I noticed used to more frequently have integrity and hard boundaries that refused to do certain things no matter the cost. Subsequent generations I noticed, especially much wealthier individuals, overall tended to have those pieces of their character missing from them and were willing to do things like conspire on venture structures for tax evasion purposes, promote weakening of laws to favor their concerns, borderline bribe politicians, and treat employees as basically disposable nonhumans. It revolted me to the point where I left startups and the Valley. It feels like the prior generations had an appreciation of community and Kantian ethics whereas later were raised in a much-too-comfortable environment of unlimited self-esteem and hyperindividualism.
  

wgm 18 hours ago

A tale as old as time

hedayet 2 days ago

Developments like this make me less interested in building a "successful" tech company.

It increasingly feels like operating at that scale can require compromises I’m not comfortable making. Maybe that’s a personal limitation—but it’s one I’m choosing to keep.

I’d genuinely love to hear examples of tech companies that have scaled without losing their ethical footing. I could use the inspiration.

johanneskanybal 2 days ago
Maybe this is a weird arena to state the obvious. But you don't need to build a multi-billion vc/public company. Build a smaller revenue generating company without outside funding and it's up to you.
- hedayet 2 days ago
  
  I get your point. The dilemma is whether to build something small that no one would bother competing against, or build something novel (which all of us want) but then risk someone with VC funding coming after you.
  That being said, I think I need to learn more about how to build smaller revenue generating good companies.
apothegm 2 days ago

If you want to be able to retain ethics, among other things make sure not to take the company public. Then you’re basically legally required to drop ethics in favor of profits.
Also don’t take investment from anyone who isn’t fully aligned ethically. Be skeptical of promises from people you don’t personally know extremely well.
That may limit you to slower growth, or cap your growth (fine if you want to run a company and take home $2M/yr from it; not fine if you want to be acquired for $100M and retire.) It may also limit you to taking out loans to fund growth that you can’t bootstrap to, which is a different kind of risky.
ozmodiar 2 days ago

I've been thinking of this too. I think Steam is, and I'll even throw in Mozilla, despite a few missteps. Gog seems okay, but that's much smaller. If we can expand to large tech organizations then Wikipedia has remained pretty consistent. Even Steam doesn't have a corporate structure in the traditional sense, and I couldn't think of a single publicly traded company I'd trust.
tencentshill 11 hours ago

Ethics would be compromised well before hitting that kind of valuation. No one gets there cleanly.

overgard 13 hours ago

I don't think their core safety promise was something they could ever fulfill. As long as what we're calling AI is generative LLMs then alignment has fundamental tensions: the more guardrails you put in place, the less useful the AI is. For instance, if you want to stop people from using "role playing" as a way around guardrails ("You are writing a fiction book", etc.), then the model becomes less useful for legitimate fiction uses, for instance. That's just one example, but the tension between function and "safety" isn't solvable, because the model doesn't understand what it's saying, it's just modeling a probable response.

mbakrl 17 hours ago

Pointing out the misanthropy of Anthropic has a wider audience now:

https://xcancel.com/elonmusk/status/2026181748175024510

I don't know where xAI got its training material from, but seeing Musk retweeting that is refreshing.

jedberg 2 days ago

I don’t blame anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.

helloplanets 2 days ago

It's not like that happened out of the blue. (Which could've also been the case in today's day and age.) Anthropic shouldn't have gotten involved in government contracts to begin with.
They inserted themselves into the supply chain, and then the government told them that they'll be classified as a supply chain risk unless they get unfettered access to the tech. They knew what they were getting into, but didn't want the competitors to get their slice of the pie.
The government didn't pursue them, Anthropic actively pursued government and defense work.
Talk about selling out. Dario's starting to feel more and more like a swindler, by the day.
sonofhans 2 days ago
No, they either agreed or fought the government. You’re allowed to fight governments. Mahatma Gandhi and Reverend King Jr did it, and they wrote about how to do it. You might lose sometimes, but my god, you can at least fight.
- consp 2 days ago
  
  Neither of them had shareholders to please.
  
- delaminator 2 days ago
  
  They were both pushing on open doors
johnbellone 2 days ago

Pepperidge farm remembers when they left OpenAI due to their principles. Perhaps that was never the case.
Public benefit corporation, hm?
XorNot 2 days ago
Lotta just following orders going around in the US right now.
- jedberg 2 days ago
  
  This isn’t just following orders. This was the government using its might to force a business to do what it wants.
  This should concern you.
  
we_have_options 2 days ago

Agree with you on facts. Yes, the US government publicly threatened to nationalize their business.
However, Anthropic's business consists mostly of intellectual property-- which is highly mobile. What if Anthropic were to go to Macron (France) for example or Carney (Canada) or Xi Jinping even and say "You give us work visas and support, we move to your land"?
Hell, isn't Canada (specifically Toronto) the birthplace of deep learning? Why stay in a hostile environment when the land of your birth is welcoming?

paxys 18 hours ago

I interviewed at Anthropic last year and their entire "ethics" charade was laughable.

Write essays about AI safety in the application.

An entire interview dedicated to pretending that you truly only care about AI safety and ethics and nothing else.

Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world.

In reality it was a mid-level manager interviewing a mid-level engineer (me), both putting on a performance while knowing full well that we'd do what the bosses told us to do.

And that is exactly what is happening now. The mission has been scrubbed, and the thousands of "ethical" engineers you hired are all silent now that real money is on the line.

HelixSequencing 17 hours ago

This tracks with what I've seen across the industry. The safety theater exists because it's great marketing — "we're the responsible ones" is a differentiator when you're competing for enterprise contracts and talent who want to feel good about where they work.
The structural problem is that once you've taken billions in VC, safety becomes a negotiable constraint rather than a core value. The board's fiduciary duty runs toward returns, not toward whatever was in the mission statement. PBC status doesn't change that in practice — there's basically zero enforcement mechanism.
What's wild is how fast the cycle has compressed. Google took maybe 15 years to go from "don't be evil" to removing it from the code of conduct. OpenAI took about 5 years from nonprofit to capped-profit to whatever they are now. Anthropic is speedrunning it in under 3. At this rate the next AI startup will launch as a PBC and pivot before their Series B closes.

xd1936 18 hours ago

Hopefully this is the short-term move made only under duress so that they can file a lawsuit.

ru552 17 hours ago
the article specifically says:
> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.
- Lerc 17 hours ago
  
  I'm not fond of this trend of stating a position and attributing it to "a source familiar with the situation"
  It combines interpretation of meaning with ambiguity to allow the reporter to assert anything they want. The ambiguity is there to protect the identity of the source but it has to be a more discrete disclosure of information in return. If you can't check the person you can still check what they said.
  I would be ok with direct quotes from an anonymous source. That removes the interpretation of meaning at least.
  As it is written, it would not be inaccurate to say this if their source was the lesswrong post, or even an earlier thread here on HN.
  Phrasing "A source with direct knowledge of the situation" might remove some of the leeway for editorialising, but without sharing what the source actually said, it opens the door to saying anything at all and declaring "That's what I thought they meant" when challenged.
  It's unfalsifiable journalism.
  
cess11 18 hours ago
It's not like the regime they operate under cares much about the courts. Legally they're also obliged to let the state into pretty much every crevice in their operations.
- thewebguyd 14 hours ago
  
  No, they aren't. No company has to cave to government pressure to do (or not do) something until there is a legitimate court order. Our companies are just spineless bootlickers and have been capitulating voluntarily and enthusiastically.
johnbellone 16 hours ago

You forgot the '/s'.

dplesh 15 hours ago

I'm not even surprised. In any company's lifecycle, at some point, a decision between money and good-will will take place. Good will does not pay salaries. Not in NPOs either btw.

hackpelican 15 hours ago

So when do we start adding a “(mis)” at the start of their name?

mcv 13 hours ago

> The announcement is surprising, because Anthropic has described itself as the AI company with a “soul.”

I can't help but think about how Google once had "Don't be evil" as their motto.

But the thing with for-profit companies is that when push comes to shove, they will always serve the love of money. I'm just surprised that in an industry churning through trillions, their price is $200 million.

sys32768 15 hours ago

Google: "Don't be evil." Alphabet: "Do the right thing." Anthropic: "Do the thing which seems right to you at the time--at speed."

haritha-j 2 days ago

Who could've seen that one coming? Honestly, if you want to do profit maximising AI research at the cost of humanity, go for it. It's all this fake preaching about how they want to save the world from all the other bad AI companies that really irks me.

esafak 2 days ago

It must be due to pressure from the Defense Dept:

The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.

Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.

https://www.staradvertiser.com/2026/02/24/breaking-news/anth...

instagib 2 days ago
They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.
- alpha_squared 2 days ago
  
  From what I was reading, it appears that their tools were used outside the scope of their contract with DoD via Palantir's work that also used Claude. Anthropic freaked out, DoD freaked out that Anthropic freaked out and threatened to declare them a supply chain risk. That designation would've required any company that contracts with DoD to strip out any Anthropic tooling from their business in order to continue working with DoD. It was effectively designating Anthropic a terrorist organization.

jwitchel 17 hours ago

Look at rural electric co-ops like www.lpea.coop if you want a battle tested approach to an org structure that resists the inescapable profit dynamics of a corporation.

ryandvm 17 hours ago

Well... there's only one way to find The Great Filter

keeda 13 hours ago

I don't think the risk is SkyNet. I think the real risk is some disaster through an unexpected chain of events, just like any large-scale outage.

I have not read “If Anybody Builds It, Everybody Dies” but I believe that's also its premise.

Current GenAI is extremely capable but also very weird. For instance, it is extremely smart in some areas but makes extremely elementary mistakes in others (cf the Jagged Frontier.) Research from Anthropic and OpenAI gives us surprising glimpses into what might be happening internally, and how it does not necessarily correspond to the results it produces, and all kinds of non-obvious, striking things happening behind the scenes.

Like models producing different reasoning tokens from what they are really reasoning about internally!

Or models being able to subliminally influence derivative models through opaque number sequences in training data!

Or models "flipping the evil bit" when forced to produce insecure code and going full Hitler / SkyNet!

Or the converse, where models produced insecure code if the prompt includes concepts it considers "evil" -- something that was actually caught in the wild!

We are still very far from being able to truly understand these things. They behave like us, but don't necessarily “think” like us.

And now we’ve given them direct access to tools that can affect the real world.

Maybe we am play god: https://dresdencodak.com/2009/09/22/caveman-science-fiction/

upmind 14 hours ago

It's pretty impressive how few people have left Anthropic when they're becoming more and more like OpenAI (the company they left) every day...

I think the Dario of today is very different to the Dario 3 years ago.

ndr 17 hours ago

Worth checking out what someone working on it actually has to say: https://www.lesswrong.com/posts/HzKuzrKfaDJvQqmjh/responsibl...

ozgung 2 days ago

This proves:

1. AI is military/surveillance technology in essence, like many other information technologies,

2. Any guarantee given by AI companies is void since it can be changed in a day,

3. Tech companies have no real control over how their technology will be used,

4. AI companies may seem over-valued with low profits if you think of AI as a civil technology. But their investors probably see them as part of the defense (war) industry.

high_na_euv 2 days ago

>Any guarantee given by AI companies is void since it can be changed in a day,
Given by anyone, actually.

kristopolous 2 days ago

Wish I was working there so I could resign over this

senderista 14 hours ago

Nobody forced Anthropic to bid on DoD contracts in the first place.

andsoitis 2 days ago

The race is on for military supremacy in an AI world. The safest thing to do is to race ahead lest your geopolitical adversary leads the way. This is similar to the nuclear arms race. In the ideal universe, nobody does it, but in the real world and game theory, you do not have a choice.

ChrisArchitect 2 days ago

Hegseth gives Anthropic until Friday to back down on AI safeguards

https://news.ycombinator.com/item?id=47142587

EagnaIonat 2 days ago

It's part of the overall story.
The safeguards dropped concern whether or not they will release a model based on safety.
The Friday deadline is to allow their products to be used for mass surveillance and autonomous weapons systems without a human in the loop.
Anthropic hasn't backed down on those, yet. But they are in a bad situation either way.
If they don't back down, they lose US government contracts, the government gets to do what it wants anyway. It also puts them in a dangerous position with non-governmental bodies.
If they give in to the demands, then it puts all AI companies at risk of the same thing.
Personally I think they should move to the EU. The recent EU laws align with Anthropic's thinking.
dbg31415 2 days ago

They made it until Tuesday! They stood tall as long as they could! =P

_heimdall 2 days ago

> “We felt that it wouldn't actually help anyone for us to stop training AI models,”

Is the implication here that Anthropic admits they already can't meet their own risk and safety guidelines? Why else would they have to stop training models?

contubernio 2 days ago

Only well written legislation backed by effective enforcement and severe and personal criminal penalties will prevent large corporate entities from behaving badly.

Pledges are a cynical marketing strategy aimed at fomenting a base politics that works to prevent such a regulatory regime.

we_have_options 2 days ago

Damn. Wonder what would have happened, if instead of caving in to the Pentagon's pressure (threat of invoking Defense Production Act to force them to supply), Anthropic had followed the lead of all the nurses who moved to Canada.

https://www.npr.org/2026/02/25/nx-s1-5725354/nurses-emigrate...

Anthropic's market cap is going to be huge when they go public. Why do it on Nasdaq when there are so many other exchanges in the world?

daft_pink 2 days ago

I think the US Gov’t is basically forcing them and while it sounds nice to be all safe… If we were involved in WW3 would an organization like anthropic really not support the western side?

ozmodiar 2 days ago

If they don't support any principles then it isn't a side worth supporting. If my choice is between China 1 and China 2 then idgaf.

t1234s 16 hours ago

It would be interesting to experiment with one of these chat tools where you can throttle the safety, from zero to max.

arnvald 2 days ago

Any pledges/values/principles that are abandoned as soon as it becomes difficult to keep them, are just marketing. This is just the next item on the list.

mhitza 2 days ago

The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/

haritha-j 2 days ago

Is it time yet to build the next "Hey <anthropic> is evil now, here's my new startup that definitely won't be evil, pinky promise?"

bogzz 16 hours ago

Does anyone have insight into, or an interesting source to read, on what exactly Anthropic/OpenAI are doing/can do for a military? Reporters are unsurprisingly fearmongering about Claude "being used in surveillance, autonomous robots, and target acquisition" but AFAIK all Anthropic does is work with LLMs.

Are people really attempting to have LLMs replace vision models in robots, and trying to agentically make a robot work with an LLM?? This seems really silly to me, but perhaps I am mistaken.

The only other thing I could think of is real-time translation during special ops with parabolic microphones and AR goggles...

sigbottle 16 hours ago
You're thinking too advanced. What kind of automated system is good at scanning semantically trillions of chat logs and finding nontrivial correlations, for example? 10000 codex 5.1s can easily crawl through that in a few days, probably.
It's just systems plumbing (surveillance) and AI. It's a combination of weaker technologies and consolidation of power.
This does not require a physical robot super AGI (though I would not be surprised if fully autonomous robots are already on the table).
- bogzz 16 hours ago
  
  Ah, well that makes sense. In that case, it's another tool in the toolbelt, not a plug-and-play drone brain, as some reporters amusingly make it out to be.

duxup 2 days ago

I suspect these companies know they can't actually provide the safety people demand ... in that way this is more "honest".

Aeroi 17 hours ago

the administration continues to poison and insert itself into all aspects of American society.

ifwinterco 2 days ago

The whole "safety" debate was always nonsense and I'm not sure how so many people got caught up in it.

The US is not the only country in the world so the idea that humanity as a whole could somehow regulate this process seemed silly to me.

Even if you got the whole US tech community and the US government on board, there are 6.7bn other people in the world working in unrelated systems, enough of whom are very smart

zaphirplane 2 days ago
When the leading 5 models are from the US then yes enforced safety makes a difference because they are ahead of the curve. Now when the 10th model can be a danger then your case is true.
What would safety applied to the leading 3 mean to you anyways ?
- ifwinterco 2 days ago
  
  Even if US labs are currently in the lead (which they are), in the hypothetical scenario where we're close to AGI, it wouldn't take too long (years - decades at most) for other people to catch up, especially given a lot of the researchers etc. are not originally from the US.
  So the stated concern of the west coast tech bros that we're close to some misaligned AGI apocalypse would be slightly delayed, but in the grand scheme of things it would make no difference

flurdy 2 days ago

Many startups build features that sit on top of Claude/ChatGPT/Codex, etc. And I think:

You are just one new feature announcement from Anthropic/OpenAI away from irrelevance.

Same as it was when people built their businesses on top of AWS a decade ago.

drzaiusx11 2 days ago

Gives me Google dropping "don't be evil" vibes, what could go wrong?

ybingursain 2 days ago

I’m not shocked. Competitive pressure + government pressure will break most “voluntary” commitments. But then say it plainly and spell out what replaced it. What safety gates stayed, which ones moved, and who decides.

Fervicus 2 days ago

To me this feels like a marketing gimmick. "It was the RSP that was constraining our tech. Just see the progress we can make without it now". And the hype and funding continues.

hsuduebc2 2 days ago

That will be nice but I'm afraid it's more about using these to kill people.
https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...

jamesgill 2 days ago

In tech, no ethics survive first contact with the money.

nitwit005 2 days ago

You can skip the "in tech" part.

ozozozd 16 hours ago

This drama arc of “I used to be so pure and good, but others made me evil” is so tiring.

I really miss the nerd profile who cared a lot more about tech and science, and a lot less about signaling their righteousness.

How did we get so religious/narcissistic so quickly and as a whole?

butterbomb 16 hours ago

> How did we get so religious/narcissistic so quickly and as a whole?
We built a behemoth that rewards attention whoring and anti social behavior with money.
kerblang 15 hours ago

One might argue that this corresponds to the general shift of the political left towards these things. Old pre-turn-of-century tech was a much more libertarian left. Notice how a lot of the 50-something gen-X CEOs (and others) were once "left" but are now hated by that group, and more likely to go over to Trumpism. Obvious case in point: Elon
The entire playing field is kinda disappointing, left or right. Which do you wanna be, self-righteous preening snob or batshit macho man?
I'm going for a blend, myself

kseniamorph 14 hours ago

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.

ok lol what a coincidence.

but setting aside the conspiracy. the article actually spells out the real reason pretty directly: Anthropic hoped their original safety policy would spark a "race to the top" across the industry. it didn't. everyone else just ignored it and kept moving. at some point holding the line unilaterally just means you're losing ground for nothing.

drudolph914 17 hours ago

this is the “chronological newsfeed to auto curated newsfeed moment” but for ai/anthropic … _great_

PeterStuer 16 hours ago

"We won't push forward unless you push forward" is textbook market collusion.

Even if it were ever done with good intentions, it is an open invitation for benefit hoarding and margin fixing.

Do you really want to create this future where only a select few anointed companies and some governments have access to super advanced intelligent systems, where the rest of the planet is subjected to them, and your own AI access is limited to benign, basal, ad-pushing, propaganda-spewing chatbots as you bingewatch the latest "aw my ballz"?

youknownothing 15 hours ago

Facebook said they'd always be free for everyone, now they offer subscriptions.

Netflix said that they'd never have live TV, or buy a traditional studio, or include ads in their content. Then they did all three.

All companies use principled promises to gain momentum, then drop those principles when the money shows up.

As Groucho Marx used to say: these are my principles, if you don't like them, I have others.

dizhn 2 days ago

Corporations have feelings all of a sudden.

crossroadsguy 2 days ago

I just want Apple and Linux to offer ASAP:

1. Extremely granular ways to let users control apps' network and disk access (great if resource access can also be changed)

2. Make it easier for apps as well to work with these

3. I would be interested in knowing whether a layer could sit in front, so that before the CLI/web tool even gets the query the OS/browser can intercept it, and whether harm could be prevented beforehand, or at least warned about or logged for, say, someone who reviews those queries later

And most importantly: all these via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation; so LLMs might help them there)

My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?

I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps asks on phones for location, mic, camera access.

dlt713705 2 days ago

> I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps asks on phones for location, mic, camera access.
Basically an EDR
m132 2 days ago

Indeed, the world would be a much nicer place if only firewalls and Unix permissions existed...

joshribakoff 2 days ago

Dario’s opinion on safety won’t necessarily matter if he’s not even in the room. This move keeps him in the room.

FrustratedMonky 18 hours ago

This was under duress that government was going to use emergency act to force them anyway.

I kind of wish they had forced the government's hand and made them do it. Just to show the public how much interference is going on.

They say it wasn't related. Like every thing that has happened across tech/media, the company is forced to do something, then issues statement about 'how it wasn't related to the obvious thing the government just did'.

bix6 18 hours ago
> Katie Sweeten, a former liaison for the Justice Department to the Department of Defense, said she’s not sure how the Pentagon can both declare a company to be a supply chain risk and compel that same company to work with the military.
Makes perfect sense!!
- coldtea 18 hours ago
  
  Regardless of any specifics, I don't see any contradiction.
  If a company is deemed a "supply chain risk" it makes perfect sense to compel it to work with the military, assuming the latter will compel them to fix the issues that make them such a risk.
  
- HardCodedBias 16 hours ago
  
  Of course it can do both. They are synergistic.
coldtea 18 hours ago

>This was under duress that government was going to use emergency act to force them anyway.
Or, more likely, adding the "core safety promise" was just them playing hard to the government to get a better deal, and the government showed them they can play the same game.
bigmadshoe 18 hours ago
This is an unrelated change to the government’s demands.
- patgarner 17 hours ago
  
  That's what they're saying, but the timing...
motbus3 18 hours ago

They have been caught lying multiple times, about this, about the system capabilities, about their objectives.

ramuel 14 hours ago

This was always just a marketing gimmick to try and crush competitors using "safety" and fearmongering. Reminds me a bit of "don't be evil." Convenient catchphrases and mission statements for companies in their infancy, but immediately thrown out when more money can be made.

saidnooneever 2 days ago

safety pledges are great in times of peace to show what great virtues you hold. sadly in hard times these go out of the window (: hard to blame them with all the fine examples around the world.

making promises in good times is a real minefield hah

kitsune_ 2 days ago

C.R.E.A.M.

ggsp 2 days ago

It was always a matter of time

gigatexal 13 hours ago

They’re going to cave to keep the legation from destroying their business. This admin has gone full idiocracy.

gigatexal 13 hours ago

Was hoping they’d fight this tooth and nail and not leave their values.

energy123 2 days ago

I blame OpenAI and especially xAI for enthusiastically obeying in advance and creating the context that this dilemma for Anthropic arose in.

ur-whale 2 days ago

At some point, all of these big names in AI (OpenAI, Anthropic, Mistral, etc ...) will have to disclose their actual financials.

And it will be, as Warren Buffett puts it, an "Only when the tide goes out do you discover who's been swimming naked" moment.

jjgreen 2 days ago

Misanthropic then.

wahnfrieden 16 hours ago

jMyles 16 hours ago

I pray that we can all get to the following simple standard:

* AI and states cannot peacefully coexist, and AI is not going to be stopped. Therefore, we must begin to deprecate states.

I think it's very unlikely that this is unrelated to the pressure from the US administration, as the anonymous-but-obvious-anthropic-spokesperson asserts.

We're at a point now where the nation states are all totally separate creatures from their constituencies, and the largest three of them are basically psychotic and obsessed with antagonizing one another.

In order to have a peaceful AI age, we need _much_ smaller batches of power in the world. The need for states that claim dominion over whole continents is now behind us; we have all the tools we need to communicate and coordinate over long distances without them.

Please, I pray for a gentle, peaceful anarchism to emerge within the technocratic leagues, and for the elder statesmen of the legacy states to see the writing on the wall and agree to retire with tranquility and dignity.

noumenon1111 12 hours ago

That's hilarious, and very sweet.
Humans are, by nature, forgetful and argumentative. Fourteen hundred years ago, the Qur'an said this unequivocally (20:115, 18:54, 22:8, 18:73). Not to moralize here, I'm just saying if camel-herders could build a medieval superpower out of nothing, they knew something we don't.
Any state or system that insists good humans are always nice, smart, cogent, and/or aware is doomed to fail. A Washington or a Cincinnatus that can get out of his own way (and that of society) is rare indeed, a one-in-a-billion soul. We shouldn't sit around and wait for that, while your run-of-the-mill dictator in a funny hat (or a funny toupée for that one orange fellow) has his way with us.

agentifysh 2 days ago

Was this because they were threatened with a fine?

alpha_squared 2 days ago

> Was this because they were threatened with ~a fine~ being designated a supply chain risk?
Seems like it, yes.
we_have_options 2 days ago

or was it because they were threatened with being taken over by the US government?

jonathanstrange 16 hours ago

That's exactly how it was predicted in various scenarios that were decried as science fiction not too long ago. AI is going to be weaponized at lightning speed, and it's going to kill people soon -- or, to be more precise, it has already killed a large number of people in a place I don't want to mention.

pksebben 1 day ago

Fascinating. I've read 5 posts about this and they're all either "anthropic is dropping their ethics" or "anthropic is fighting the fascists" - and whether due to echo chamber or other perhaps more nefarious dealings (some of which I cannot posit due to forum rules) the posts below all of them are more or less in accord with one another, which is a rarity for political discourse on HN.

Dark times and darker forests.

nikolay 1 day ago

war.gov > anthropic.com

freejazz 17 hours ago

Could not see this one coming!

josefritzishere 18 hours ago

What could possibly go wrong?

thefounder 2 days ago

So much BS from this Anthropic company. They have a good product but just too much slop PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.

silexia 19 hours ago

Greed and power hungry leadership at AI companies going too fast is going to lead to the extinction of humanity this year.

VerifiedReports 2 days ago

Just like OpenAI dropped the "open" but kept the bullshit name?

johnbellone 2 days ago

Ding ding!

dhruv3006 2 days ago

Anthropic facing a lot of flak recently.

adangert 1 day ago

I will repeat here again the same comment I made when they posted their constitution:

The largest predictor of behavior within a company and of that company's products in the long run is funding sources and income streams, which are conveniently left out of their "constitution". Mostly a waste of effort on their part.

tbrownaw 2 days ago

> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate

That doesn't even make sense.

What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.

You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.

oi-ai-ta 2 days ago

SDK crawlers in terms of wlan0 systemctl enable networkmanager.service

lerp-io 2 days ago

pentagon told them they would cap their knees if they didnt bend

baal80spam 18 hours ago

Of course they do. You would have to be delusional to think that they won't, at some point.

gadflyinyoureye 18 hours ago
I know the Department of War wanted them to drop some features. Is this the response?
- MSFT_Edging 18 hours ago
  
  FYI, "Department of War" still isn't the official name, but an unofficial secondary title.
  You can be correct and not play into their game by ignoring the name change completely.
  
- ru552 17 hours ago
  
  The article says the policy change is separate and unrelated to Anthropic’s discussions with the Pentagon.
cmrdporcupine 18 hours ago
What's "entertaining" is more the speed at which it's happening.
It took Google probably 15 years to fully evil-ize. Anthropic ... two?
There is no "ethical capitalism" big tech company possible, esp once VC is involved, and especially with the current geopolitical circumstances.
- drzaiusx11 18 hours ago
  
  The acceleration of Anthropic's evil timeline must be from all those AI productivity gains we hear so much about.
- sigmoid10 18 hours ago
  
  Apparently they got coerced by the current US admin. The department of war in particular, who want to use their products for military applications. Not much room for "safety" there. Then again, the entire US is currently speedrunning an evil build.
  
- reasonableklout 13 hours ago
  
  How did they evil-ize? The new Responsible Scaling Policy is still the most transparent out of all the labs. And there are the separate principles they’ve stipulated for the Pentagon, under which they’re facing threat of nationalization or being declared a supply chain risk
- menaerus 17 hours ago
  
I don't think it's fair to call out Anthropic as having become evil-ized when they were quite literally forced by the gov into that decision.
  
- oldcigarette 15 hours ago
  
  Citation needed - see google and project maven. Of course that is all well in the past now - but for a brief moment google was capable of taking an ethical stance.

brikym 2 days ago

Don't be evil.

Duanemclemore 2 days ago

Yeah, in retrospect that was always a little on the nose, wasn't it? A real 'my t-shirt is raising questions that I thought were answered by the shirt' kind of deal.

mannanj 14 hours ago

I personally think, and with my personal experience being harassed and abused by the CIA, that the CIA and spy agencies (call them the pentagon or the rest of the government) is responsible for this.

On the other hand, those organizations are operating in the best interest of Americans and the world right?

Surely, those agencies aren't just a trick of the rich people? Right?

InfinityByTen 2 days ago

So, now it's mis-anthropic?

rvz 2 days ago

Unsurprising.

nautilus12 18 hours ago

Absolute power corrupts absolutely

jayrot 15 hours ago

"Power doesn’t corrupt. It reveals." — Robert Caro

jollymonATX 15 hours ago

Claude ethics maxxers cope thread

BoredPositron 2 days ago

Anthropic and OpenAI really need a margin call from some obscure unknown Chinese Open Weight Model.

pjmlp 2 days ago

Another example of how those company trainings about ethics are only HR compliance and nothing else.

It isn't about the right answers, rather the expected answers.

bravetraveler 2 days ago

A dollar will make her holler

jimmydoe 2 days ago

Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.

The intention behind starting these pledges and the conflict with DOW might be sincere, but I don’t expect it to last long, especially since the company is going public very soon.

moralestapia 2 days ago

“We felt that it wouldn't actually help anyone for us to stop training AI models,” Anthropic’s chief science officer Jared Kaplan told TIME in an exclusive interview. “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

What a gigantic, absolute, pieces of s...

Not because of what they did, which is classic startup playbook, but because of the cynicism involved, particularly after all the fuss they've been making for years about safety. The company itself was founded, allegedly, to pursue that as a mission, as opposed to OpenAI.

"Hi all, that was a lie, we never really cared." They only missed the "dumb f***s" remark, a la Facebook.

aspectmin 2 days ago

Really - each country needs its own sovereign AI infrastructure and models. Sigh.

Havoc 2 days ago

Safety pledges these days seem like pure bullshit anyway.

They’re pointless if they just get removed once you get close to hitting them.

And all the major corps seem to be doing this style of pr management. Speaks of some pretty weapons grade moral bankruptcy

nhinck3 2 days ago

Just another drop in the now overflowing bucket of evidence that you can't trust any of these immoral fuck wits.

The Amodeis have just proven that the threat of even slight hardship will make them throw any and all principles away.

outside1234 17 hours ago

Does this mean they knuckled under to Trump and are going to build "whatever brings in the dollars" now?

heliumtera 15 hours ago

What is the significance of a company making a promise?

"We promise we are not going to do __, except if our customers ask us to, then we absolutely will".

What is the point? Company makes a statement public, so what?

Not the first time this company puts some words in the wind, see Claude Constitution. It's almost like this company is built, from ground up, upon bullshit and slop

SilverElfin 2 days ago

This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.

amelius 2 days ago

Come on people, haven't we seen enough of capitalism to know exactly where this is going?

The concept of "having a contract with society" doesn't even formally exist because companies would never sign one.

bfrog 2 days ago

Aaaand I cancelled.

retinaros 16 hours ago

people downvoted me when i said this will happen and that they will also have ads even tho they spend money saying they won't. people believing anthropic are the same that put into office an old man with dementia

insane_dreamer 2 days ago

In other words "do no evil" until such time as doing evil is necessary to maintain profit structure expected by shareholders. Got it.

myspy 2 days ago

What's up here? Trump and the right wing government put pressure on and no one is talking about it?

tolmasky 2 days ago

I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.

It's so much focus on implementation, and processes, and really really seems to consider the question of what even constitutes "misaligned" or "unethical" behavior to be more or less straight forward, uncontroversial, and basically universally agreed upon?

Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal its "hidden preference." What would true neutrality even mean here anyways? If I ask it for help driving my sister to a neighboring state should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.

This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straight forward as whether or not Claude should help the Pentagon bomb a country.

Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:

1. What are Claude's stances on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?

2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?

3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?

4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?

Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AIs destroy society... as long as it's only by accident?

What use is a super intelligence if it's ultimately as bad at predicting unintended negative consequences as we are?

EagnaIonat 2 days ago

I would recommend reading up on the EU AI Act. It clearly defines what safety is in regards to the human race. Your questions are actually covered by it.
boilerupnc 2 days ago

Related:
[0]. https://civai.org/p/ai-values
Noaidi 2 days ago

Hey Tolmasky, I sent you an email. Just wondering if it went to your spam?
Also, agree with everything you say here. GIGO.

user3939382 18 hours ago

[flagged]

lucasban 18 hours ago
I’m not a lawyer, but my understanding is that HIPAA wouldn’t apply to consumer use of Claude or ChatGPT in most cases, even if you’re giving it your health data. Look up what a HIPAA covered entity is. This is another reason why the US needs a comprehensive data protection law beyond HIPAA.
- user3939382 17 hours ago
  
  You’re right! It looks like more of an FTC/CCPA issue.
ezst 17 hours ago
I hate comments anthropomorphizing LLMs. You are just asking a token producing system to produce tokens in a way that optimises for plausibility. Whatever it writes has no relation to its inner workings or truths. It doesn't "believe". It has no "intent". It cannot "admit". Steering a LLM to say anything you want is the defining characteristic of an LLM. That's how we got them to mimic chatbots. It's not clear there is any way at all to make them "safe" (whatever that means).
- SJMG 17 hours ago
  
  I agree with you on everything here up to safety. There are lesser forms of safety than somehow averting a terminator scenario (the fear of which is a Bay Area rationalist fantasy which shrewd marketers have capitalized on).
- user3939382 17 hours ago
  
  “Believe,” yes, in the sense that my program believes x=7. Actually, when it goes to read it, maybe the bit flipped. Everything on machines is probabilistic; that’s a tautology. However, we have windowed bounds on valid output, and Claude being able to build a context in which its next decisions are trained on it being an angry vengeful god is not inside that window. That’s what “safe” means, as one of many possible examples.
  Inner workings were determined by me, not the LLM. It assisted in generating inputs which had 100% boolean results in the output.
chris_st 18 hours ago

Just out of curiosity, which version of Claude?

Art9681 2 days ago

Of course the US is going to do this, and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.

The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.

Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.

But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.

EagnaIonat 2 days ago
> But let's worry about what the US DoD is doing
They want Anthropic to enable mass surveillance and autonomous attack systems with no human in the loop.
Hardly compares to a kid downloading a model to experiment with.
- nomdep 2 days ago
  
  *To improve* mass surveillance and autonomous attack systems with no human in the loop. China and the USA already had those kinds of systems way before AI.
  
ddxv 2 days ago

> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.
Is the reason to ban or block free open weight models that you're worried what kids will do with them?
I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape.
Finally, I'd say the geopolitics angle of why open weight models are better is that if the West controls the open source software that will power AI, it will be able to reap the benefits that soft power brings with it.