Comment by NitpickLawyer

6 hours ago

> These tools feel symmetric for defenders to use as well.

I don't think so. From a pure mathematical standpoint, you'd need better (or equal) results at avg@1 or maj@x, while the attacker needs just pass@x to succeed. That is, the red agent needs to work just once, while the blue agent needs to work all the time. Current agents are much better (20-30%) at pass@x than maj@x.

In real life that's why you sometimes see titles like "teenager hacks into multi-billion dollar company and installs crypto malware".

I do think that you're right in that we'll see an improved security posture by using red v. blue agents "in a loop". But I also think that red has a mathematical advantage here.
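
To make the pass@x versus maj@x asymmetry concrete, here is a minimal sketch under a simplifying assumption that is not from the comment: each attempt succeeds independently with the same probability p. The function names and the value p = 0.3 are hypothetical and only illustrate the shape of the gap, not measured agent performance.

```python
from math import comb


def pass_at_k(p: float, k: int) -> float:
    """Probability that at least one of k independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** k


def maj_at_k(p: float, k: int) -> float:
    """Probability that a strict majority of k independent attempts succeed."""
    need = k // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(k, i) * p**i * (1.0 - p) ** (k - i)
        for i in range(need, k + 1)
    )


if __name__ == "__main__":
    p = 0.3  # hypothetical per-attempt success rate
    for k in (1, 5, 15):
        print(f"k={k:2d}  pass@k={pass_at_k(p, k):.3f}  maj@k={maj_at_k(p, k):.3f}")
```

In this toy model, with p below 0.5, more attempts push pass@k toward 1 while maj@k shrinks toward 0, which is the attacker-friendly direction the comment describes. Real agent attempts are correlated, so the actual gap will differ.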

>> These tools feel symmetric for defenders to use as well.

> I don't think so. From a pure mathematical standpoint, you'd need better (or equal) results at avg@1 or maj@x, while the attacker needs just pass@x to succeed.

Executing remote code is a choice, not some sort of force of nature.

Timesharing systems are inherently not safe and way too much effort is put into claiming the stone from Sisyphus.

SaaS and complex centralized software need to go, and that is way overdue.

  • Awesome! What’s your strategy for migrating the entire world’s infrastructure to whatever you’re thinking about?

    • My strategy is to not use "the entire world's infrastructure" which makes it redundant.

      If enough people cancel their leftpad-as-a-Service subscription the server can be unplugged.

      (Yes I am somewhat hyperbolic and yes I see use for internet connected servers and clients. I argue against the SaaS driven centralization.)