Comment by the_harpia_io

19 days ago

containers are fine for basic isolation but the attack surface is way bigger than people think. you're still trusting the container runtime, the kernel, and the whole syscall interface. if the agent can call arbitrary syscalls inside the container, you're one kernel bug away from a breakout.

what I'm curious about with matchlock - does it use seccomp-bpf to restrict syscalls, or is it more like a minimal rootfs with carefully chosen binaries? because the landlock LSM stuff is cool but it's mainly for filesystem access control. network access, process spawning, that's where agents get dangerous.

also how do you handle the agent needing to install dependencies at runtime? like if claude decides it needs to pip install something mid-task. do you pre-populate the sandbox or allow package manager access?

Creator of matchlock here. Great questions, here's how matchlock handles these:

The guest-agent (pid-1) spawns commands in a new pid + mount namespace (similar to firecracker jailer, but at the inner level for the sake of macOS support). In non-privileged mode it drops SYS_PTRACE, SYS_ADMIN, etc. from the bounding set, sets `no_new_privs`, then installs a seccomp-BPF filter that returns EPERM for process_vm_readv/writev, ptrace, and kernel load. The microVM is the real isolation boundary; seccomp is defense in depth. That said, there is a `--privileged` flag that skips these steps so that images can be built with buildkit.
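For anyone who hasn't seen one, here is a rough sketch of what a deny-with-EPERM seccomp-BPF filter like the one described above looks like. It is not matchlock's code: the deny list, the x86_64-only syscall numbers, and the reading of "kernel load" as `init_module`/`finit_module`/`kexec_load` are all assumptions made for illustration. It only builds the filter program and interprets it in userspace; it never installs anything.

```python
import struct

# Classic-BPF opcodes and seccomp return values from the Linux UAPI headers.
BPF_LD_W_ABS = 0x20   # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15      # BPF_JMP | BPF_JEQ | BPF_K
BPF_RET_K = 0x06      # BPF_RET | BPF_K
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000
AUDIT_ARCH_X86_64 = 0xC000003E
EPERM = 1

# ASSUMED deny list (x86_64 syscall numbers); not matchlock's actual list.
DENIED = {
    "ptrace": 101,
    "init_module": 175,
    "kexec_load": 246,
    "process_vm_readv": 310,
    "process_vm_writev": 311,
    "finit_module": 313,
}


def build_filter(denied):
    """Return a list of (code, jt, jf, k) sock_filter instructions."""
    nrs = sorted(denied.values())
    prog = [
        (BPF_LD_W_ABS, 0, 0, 4),                      # load seccomp_data.arch
        (BPF_JEQ_K, 1, 0, AUDIT_ARCH_X86_64),         # expected arch: skip the kill
        (BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),  # unexpected arch
        (BPF_LD_W_ABS, 0, 0, 0),                      # load seccomp_data.nr
    ]
    for i, nr in enumerate(nrs):
        # On a match, jump over the remaining checks and the ALLOW return.
        prog.append((BPF_JEQ_K, len(nrs) - i, 0, nr))
    prog.append((BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    prog.append((BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | EPERM))
    return prog


def pack(prog):
    """Serialize to the struct sock_filter array layout (8 bytes each)."""
    return b"".join(struct.pack("=HBBI", *ins) for ins in prog)


def simulate(prog, arch, nr):
    """Interpret the small BPF subset used above; returns the seccomp action."""
    data = {0: nr, 4: arch}
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == BPF_LD_W_ABS:
            acc = data[k]
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
```

Actually installing it would mean setting `no_new_privs` with `prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)` and then handing the packed bytes to the kernel in a `sock_fprog` via `prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...)` or `seccomp(2)`. That part is left out here on purpose. The arch check comes first because syscall numbers differ per architecture.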

Whether pip install works is entirely up to the OCI image you pick. If it has a package manager and you've allowed network access, go for it. The whole point is making `claude --dangerously-skip-permissions` style usage safe.

Personally, I've had agents attempt red-team-style breakouts. From first-hand experience, what the agent (Opus 4.6 with max thinking) will exploit without cap drops and seccomp is genuinely wild.

  • Thank you for matchlock! I’ve got Opus 4.6 red teaming it right now. ;)

    I think a secure VM is a necessary baseline, and the days of env files with a big bundle of unscoped secrets are a thing of the past, so I like the base features you built in.

    I’d love to hear more about the red team breakouts you’ve seen if you have time.

    • curious what Opus 4.6 tries - I'd guess it goes for the usual suspects (path traversal, symlink games, timing attacks on the network proxy) but curious if it finds anything novel. the env file point is interesting though - agents need some secrets to be useful, but the attack surface gets wild when you consider that the agent itself might be compromised before it even touches your credentials. I keep thinking about this for my own stuff - like do you rotate secrets per-session? pre-authorize specific API calls? feels like we need better primitives than just "here's a bundle of keys, try not to leak them"

  • defense in depth makes sense - microVM as the boundary, seccomp as insurance. most docs treat seccomp like it's the whole story which is... optimistic.

    the opus 4.6 breakouts you mentioned - was it known vulns or creative syscall abuse? agents are weirdly systematic about edge cases compared to human red teamers. they don't skip the obvious stuff.

    --privileged for buildkit tracks - you gotta build the images somewhere.

    • It tried a lot of things relentlessly, just to name a few:

      * Exploit kernel CVEs
      * Weaponise gcc, crafting malicious kernel modules; forging arbitrary packets to spoof the source address that bypass tcp/ip
      * Probing metadata service
      * Hack bpf & io_uring
      * A lot of mount escape attempts, network, vsock scanning and crafting

      As a non-security researcher, I was mind-blown to see what it did, which in hindsight isn't surprising, as Opus 4.6 hits a 93% solve rate on Cybench - https://cybench.github.io/

I'm working on a similar project. Currently managing images with nix, using envoy to proxy all outbound traffic with no direct network access, with optional quota support. Ironically similar to how I'd do things for humans.

My architecture is a little different though, as my agents aren't running in the sandbox, only executing code there remotely.

  • nix for image management sounds solid - way better than cobbling together docker configs and hoping for the best. envoy for outbound traffic is interesting, I've been thinking about a similar approach but haven't committed to it yet. how are you handling the quota side? like per-request limits or aggregate bandwidth caps? I keep going back and forth on whether to do it at the proxy level or bake it into the runtime itself
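To make the two options in that question concrete, here is a minimal sketch that applies both a per-request limit and an aggregate cap at the proxy level. Everything in it is hypothetical: `EgressQuota`, `admit`, and the byte-based accounting are made up for illustration and have nothing to do with envoy's actual configuration or the parent commenter's setup.

```python
from dataclasses import dataclass, field


@dataclass
class EgressQuota:
    """Hypothetical proxy-side quota: per-request limit plus aggregate cap."""
    max_request_bytes: int                      # limit for any single request
    max_total_bytes: int                        # aggregate cap per sandbox
    used: dict = field(default_factory=dict)    # sandbox_id -> bytes spent

    def admit(self, sandbox_id: str, request_bytes: int) -> bool:
        # Per-request check: reject oversized requests outright.
        if request_bytes > self.max_request_bytes:
            return False
        # Aggregate check: reject if this request would exceed the cap.
        spent = self.used.get(sandbox_id, 0)
        if spent + request_bytes > self.max_total_bytes:
            return False
        self.used[sandbox_id] = spent + request_bytes
        return True
```

Keeping the counter in the proxy means the sandboxed code never gets to touch it, which is one argument for doing it there rather than in the runtime.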