Comment by 9cb14c1ec0

1 day ago

Now, it's very possible that this is Anthropic marketing puffery, but even if it's only half true it still represents an incredible advancement in hunting vulnerabilities.

It will be interesting to see where this goes. If it's actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes. My assumption has been for years that companies like NSO Group have had automated bug-hunting software that recognizes vulnerable code areas. Maybe this will level the playing field in that regard.

It could also totally reshape military sigint in similar ways.

Who knows, maybe the sealing off of memory vulns for good will inspire whole new classes of vulnerabilities that we currently don't know anything about.

You should watch this talk by Nicholas Carlini (security researcher at Anthropic). Everything in the talk was done with Opus 4.6: https://www.youtube.com/watch?v=1sd26pWhfmg

  • Just a thought: The fact that the found kernel vulnerability went decades without a fix says nothing about the sophistication needed to find it, just that nobody was looking. So it says nothing about the model’s capability. That LLMs can find vulnerabilities is a given and expected, considering they are trained on code. What worries me is the public buying the idea that it could in any way be a comprehensive security solution. The most likely outcome is that they’re as good at hacking as they are at development: mediocre on average; untrustworthy at scale.

    • Regardless of how impressive you find the vulnerabilities themselves, the fact that the model is able to make exploits without human guidance will enable vastly more people to create them. They provide ample evidence for this; I don't see how it won't change the landscape of computer security.

    • People have, of course, been looking. Linux has been the #1 corpus for the methods for ages.

    • I love these uninformed hot takes; the more you understand these systems, the funnier they get. Stop imagining and start engineering, and you’ll see what I mean. Your vision of this tech is clearly shaped by blog posts. Go build stuff with it.

Apple has already largely crushed hacking with memory tagging on the iPhone 17 and Lockdown Mode. Architectural changes, safer languages, and sandboxing have done more for security than just fixing bugs when you find them.

  • If what you are saying is true, then you would see exploit marketplaces list iOS exploits at hundreds of millions of dollars. Right now a cursory glance sets the price for a zero-click persistent exploit at $2m, behind Android at $2.5m. Still high, and yes, higher than five years ago when it was around $1m for both, but still not "largely crushed". It is still easy to get into a phone if you are a state actor.

    • Hi, would you mind explaining how this works? Someone finds an exploit in Android/iOS and then sells it for $2.5m/$2m on some dark market?

  • As I understood it, Memory Integrity Enforcement adds an additional check on heap dereferences (and it doesn’t apply to every process for performance reasons). Why does it crush hacking rather than just adding another incremental roadblock like many other mitigations before?

    • I'm not certain there is a performance hit since there is dedicated silicon on the chip for it. I believe the checks can also be done async which reduces the performance issues.

      It also doesn't matter that it isn't running by default in apps, since the processes you really care about are the OS ones. If someone finds an exploit in TikTok, it doesn't matter all that much unless they find a way to escalate it into an exploit on an OS process with higher permissions.

      MTE (Memory Tagging Extension) also has a dual purpose: it blocks memory exploits as they happen, and it detects and reports them back to Apple. So even if you have a phone from before the 17 series, if any phone with MTE hardware gets hit, the bug is immediately made known to Apple and fixed in code.
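      The tag-matching idea described above can be sketched in a few lines of C. This is a software simulation of the concept only, not Apple's or Arm's actual implementation: real MTE stores 4-bit tags in the unused top byte of a 64-bit pointer and in memory-granule metadata, and the comparison happens in hardware on every load and store, faulting on mismatch.

      ```c
      /* Conceptual simulation of MTE-style memory tagging.
         Real hardware checks tags on every dereference; here we
         model the check as an explicit function. */
      #include <stdio.h>
      #include <stdint.h>
      #include <stdlib.h>

      #define TAG_SHIFT 56  /* top byte of a 64-bit pointer, ignored by
                               address translation, can carry a tag */

      /* Allocate memory and "color" the returned pointer with a 4-bit
         tag; the same tag is recorded for the memory itself. */
      static void *tagged_alloc(size_t n, uint8_t *mem_tag) {
          void *p = malloc(n);
          *mem_tag = (uint8_t)(rand() & 0xF);
          return (void *)((uintptr_t)p |
                          ((uintptr_t)*mem_tag << TAG_SHIFT));
      }

      /* Hardware analogue: compare the pointer's tag with the memory's
         tag. A mismatch (use-after-free, overflow into a neighboring
         allocation with a different tag) would fault immediately. */
      static int tag_check(void *tagged_ptr, uint8_t mem_tag) {
          uint8_t ptr_tag = (uint8_t)(((uintptr_t)tagged_ptr >> TAG_SHIFT) & 0xF);
          return ptr_tag == mem_tag;  /* 0 means the access traps */
      }

      int main(void) {
          uint8_t tag;
          void *p = tagged_alloc(64, &tag);
          printf("valid access passes check: %d\n", tag_check(p, tag));

          /* Simulate a stale pointer: the memory was freed and
             re-tagged, so the old pointer's tag no longer matches. */
          uint8_t new_tag = (uint8_t)((tag + 1) & 0xF);
          printf("stale access passes check: %d\n", tag_check(p, new_tag));

          free((void *)((uintptr_t)p & (((uintptr_t)1 << TAG_SHIFT) - 1)));
          return 0;
      }
      ```

      The point of the sketch is the dual purpose mentioned above: a mismatched tag both stops the access and is itself a signal that something tried a bad memory access, which is what makes fleet-wide reporting possible.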

  • Lockdown Mode is opt-in only, though.

    • It is, but if you are the kind of person these exploits are likely to target, you should have it on. So far there have been no known exploits that work in Lockdown Mode.

The interesting selling point about this, if the claims hold up, is that nobody will be able to produce secure software without access to one of these models. Good for them $$$ ^^

  • Until someone in the PRC distills DeepSeek Security++ from them and lets anyone download it.

  • Well, except that they're giving away a huge sum of compute to other big tech firms apparently for free?

    • No one said free.

      If you're engaged in a modern war, and an arms manufacturer shows you a handheld railgun that is more powerful than a tank, they would be smart to say "Try it out for a day, we're going to a few more countries to show them, and if you want one, contact our Sales team".

      They went to large companies that can afford large sums of money to harden their products, knowing this software will be available to their competitors.

Business idea for Anthropic: What if they provided (likely costly) audits, without providing access to the model?

> its very possible that this is Anthropic marketing puffery

It isn't.

  • Two possibilities:

    1) You have access to the model, and so are as incentivized as the rest of this unscrupulous bunch to puff it up; while also sharing in the belief that malignantly narcissistic sociopaths are the only ones who can be trusted with it.

    2) You lack access to the model, and are just doing more PR puffery.

    • I'm going with (3) I've been working in software security for over 20 years, I've seen what this model produces, and I know what I'm talking about.

> It will be interesting to see where this goes. If its actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes.

It will likely cause some interesting tensions with government as well.

eg. Apple's official stance per their 2016 customer letter is no backdoors:

https://www.apple.com/customer-letter/

Will they be allowed to maintain that stance in a world where all the non-intentional backdoors are closed? The reason the FBI backed off in 2016 is because they realized they didn't need Apple's help:

https://en.wikipedia.org/wiki/Apple%E2%80%93FBI_encryption_d...

What happens when that is no longer true, especially in today's political climate?

  • Big open question what this will do to CNE vendors, who tend to recruit from the most talented vuln/exploit developer cohort. There's lots of interesting dynamics here; for instance, a lot of people's intuitions about how these groups operate (ie, that the USG "stockpiles" zero-days from them) weren't ever real. But maybe they become real now that maintenance prices will plummet. Who knows?

  • I assume that right now some of the biggest spenders on tokens at Anthropic are state intelligence communities who are burning up GPU cycles on Android, Chromium, WebKit code bases etc trying to find exploits.

Why wouldn't it be true? The cost is nothing compared to the bad PR if a bad actor took advantage of Anthropic's newest model (after release) to cause real damage. This gets ahead of that risk, at least to some extent.

Yesterday, I took a web application, downloaded the trial, and asked AI to be a security researcher and find me high- and critical-severity bugs.

Even vanilla models spewed out POCs for three RCEs in less than an hour.