This is a well understood and well documented subject. Do your own research.
Start here to help give you ideas for what to research:
https://linuxsecurity.com/features/what-is-a-container-escap...
This kind of response isn't helpful. He's right to ask about the motivations for the claim that containers in general are "not a sandbox" when the design of containers/namespaces/etc. looks like it should support using these things to make a sandbox. He's right to be confused!
If you look at the interface contract, both containers and VMs ought to be about equally secure! Nobody is an idiot for reading about the two concepts and arriving at this conclusion.
What you should have written is something about your belief that the inter-container, intra-kernel attack surface is larger than the intra-hypervisor, inter-kernel attack surface, and so it's less likely that someone will screw up implementing a hypervisor so as to open a security hole. I wouldn't agree with this position, but it would at least be defensible.
Instead, you pulled out the tired old "educate yourself" trope. You compounded the error with the weaselly "are considered" passive-voice construction that lets you present the superior security of VMs as a law of nature instead of your personal opinion.
In general, there's a lot of alpha in questioning supposedly established "facts" presented this way.
> This is a well understood and well documented subject. Do your own research.
Anything, including the GNU/Linux kernel, can be broken by such security vulnerabilities.
This is not a weakness in the design of containers. `npm install`, on the other hand, is broken by design (due to post-install scripts).
> This is not a weakness in the design of containers.
Partially correct.
Many container escapes are also because the security of the underlying host, container runtime, or container itself was poorly or inconsistently implemented. This creates gaps that allow escapes from the container. There is a much larger potential for mistakes, creating a much larger attack surface. This is in addition to kernel vulnerabilities.
While you can implement effective hardening across all the layers, the potential for misconfiguration is still there, therefore there is still a large attack surface.
While a virtual host can be escaped from, the attack surface is much smaller, leaving less room for potential escapes.
This is why containers are considered riskier for a sandbox than a virtual host. Which one you use, and why, really should depend on your use case and threat model.
Sad to say, a disappointing number of people don't put much hardening into their container environments, including production k8s clusters. So it's much easier to say that a virtual host is better for sandboxing than containers, because many people are less likely to get it wrong.
Escaping a properly set up container is a kernel 0day. Due to how large the kernel attack surface is, such 0days are generally believed to exist. Unless you are a high value target, a container sandbox will likely be sufficient for your needs. If cloud service providers discounted this possibility then a 0day could be burned to attack them at scale.
Also, you can use the runsc (gVisor) runtime for Docker. If you are careful not to expose vulnerable protocols to the container, nothing will escape it with that runtime.
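For what it's worth, here is a minimal sketch of what "use runsc and don't expose much" could look like as a `docker run` invocation. It only builds the argument list; it assumes gVisor's `runsc` is already installed and registered as a Docker runtime on the host. The function name `sandbox_argv`, the image, and the resource limits are illustrative choices, not a recommended baseline.

```python
import shlex


def sandbox_argv(image, command, runtime="runsc"):
    """Build a `docker run` argv with conservative isolation flags.

    Assumption: `runtime` names a runtime already registered with the
    Docker daemon (for example gVisor's runsc).
    """
    return [
        "docker", "run", "--rm",
        f"--runtime={runtime}",                  # run under the chosen runtime
        "--network=none",                        # no network inside the container
        "--cap-drop=ALL",                        # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block setuid-style escalation
        "--read-only",                           # read-only root filesystem
        "--pids-limit", "64",                    # illustrative limit
        "--memory", "256m",                      # illustrative limit
        "--user", "65534:65534",                 # unprivileged uid:gid
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = sandbox_argv("python:3-slim", ["python", "-c", "print('hi')"])
    print(shlex.join(argv))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; passing `argv` to `subprocess.run` would be the obvious next step on a host that actually has the runtime configured.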
You start with the assumption of a "properly set up container". I also believe you are oversimplifying the attack surface.
A container escape can be caused by combinations of breakdowns in several layers:
- Kernel implementation - aka, a bug. It's rare, but it happens
- Kernel compile time options selected - This has become more rare, but it can happen
- Host OS misconfiguration - Can be a contributing factor to enabling escapes
- Container runtime vulnerability - A vulnerability in the runtime itself
- Container runtime misconfiguration - Was the runtime configured properly?
- Individual container runtime misconfiguration - Was the individual container configured to run securely?
- Individual Container build - what's in the container, and can be leveraged to attack the host
- Running container attack surface - What's the running container's attack surface
The last two are included for completeness, but since the original article is about running untrusted Python code, they are irrelevant in this case.
My point is that you must consider the system as a whole to assess its overall attack surface and risk of compromise. There is a lot more that can go wrong to enable a container escape than you implied.
There are some people who are knowledgeable enough to ensure their containers are hardened at every level of the attack surface. Even then, how many are diligent enough to apply that attention to detail every time? How many automate their configurations?
Most default configurations are not hardened as a compromise to enable usability. Most people who build containers do not consider hardening every possible attack surface. Many don't even know the basics. Most companies don't do a good job hardening their shared container environments - often as a compromise to be "faster".
So yeah, a properly set up container is hard to escape.
Not all containers are set up properly - I'd argue most are not.
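To make the "individual container runtime misconfiguration" layer concrete, here is a rough sketch of the kind of check that could be automated. It scans a `docker run` argument list for a handful of well-known risky options. The function names and the short list of checks are my own illustration and are nowhere near a complete audit; it also does not try to tell Docker flags apart from arguments that belong to the container's own command.

```python
# Flags whose value may be given either as `--flag value` or `--flag=value`.
TAKES_VALUE = {"--cap-add", "--security-opt", "-v", "--volume", "--pid", "--network"}


def normalize(argv):
    """Rewrite `--flag value` pairs as `--flag=value` for the flags we inspect."""
    out, i = [], 0
    while i < len(argv):
        arg = argv[i]
        if arg in TAKES_VALUE and i + 1 < len(argv):
            out.append(f"{arg}={argv[i + 1]}")
            i += 2
        else:
            out.append(arg)
            i += 1
    return out


def audit(argv):
    """Return a list of human-readable findings for risky `docker run` options."""
    findings = []
    for arg in normalize(argv):
        flag, _, value = arg.partition("=")
        if flag == "--privileged" and value != "false":
            findings.append("--privileged removes most of the container's isolation")
        elif flag == "--cap-add" and value.upper() in {"SYS_ADMIN", "CAP_SYS_ADMIN", "ALL"}:
            findings.append(f"--cap-add {value} grants very broad privileges")
        elif flag == "--security-opt" and value in {"seccomp=unconfined", "apparmor=unconfined"}:
            findings.append(f"--security-opt {value} disables a default confinement layer")
        elif flag in {"--pid", "--network"} and value == "host":
            findings.append(f"{flag}=host shares a host namespace with the container")
        elif flag in {"-v", "--volume"} and value.split(":")[0] == "/var/run/docker.sock":
            findings.append("mounting the Docker socket gives control over the daemon")
    return findings


if __name__ == "__main__":
    for finding in audit(["docker", "run", "--privileged", "-v",
                          "/var/run/docker.sock:/var/run/docker.sock", "alpine"]):
        print(finding)
```

A check like this only covers one of the layers in the list; the kernel, host OS, and runtime layers need their own review.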
> Escaping a properly set up container is a kernel 0day.
No, it is not. In fact, many of the container escapes we see are caused by bugs in the container runtimes themselves, which can differ considerably between implementations. CVE-2025-31133 was published 2? months ago and had nothing at all to do with the kernel, just like many other container escapes.
Note that this lists 3 vulnerabilities as examples: CVE-2016-5195 (Dirty COW), CVE-2019-5736 (host runc overwrite) and CVE-2022-0185 (kernel filesystem-context heap overflow).
Out of those, only the first one is actually exploitable in common setups.
CVE-2019-5736 requires either an attacker-controlled image or "docker exec". This is not likely to be the case in the "untrusted python" use case, nor in many Docker setups.
CVE-2022-0185 is blocked by the seccomp filter in default installs, so as long as you don't give your containers the --privileged flag, you are OK. (And if you do give this flag, the escape is trivial without any vulnerabilities.)
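As a small illustration of the seccomp point, here is a sketch that answers "does this profile let an unprivileged container make syscall X?" for a profile in Docker's JSON format (`defaultAction` plus a `syscalls` list of rules). The profile below is a tiny hypothetical one, not Docker's actual default, and the idea that an exploit needs a syscall such as `unshare` is an assumption made for the example.

```python
# Hypothetical allowlist profile in the shape of Docker's seccomp JSON.
EXAMPLE_PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write", "exit_group"], "action": "SCMP_ACT_ALLOW"},
        # Allowed only when the container holds CAP_SYS_ADMIN.
        {"names": ["unshare", "mount"], "action": "SCMP_ACT_ALLOW",
         "includes": {"caps": ["CAP_SYS_ADMIN"]}},
    ],
}


def allowed_unconditionally(profile, syscall):
    """True if `syscall` is permitted without any capability or argument condition.

    Simplification: the first unconditional rule naming the syscall decides;
    conditional rules are skipped and the profile's default action applies.
    """
    for rule in profile.get("syscalls", []):
        conditional = rule.get("includes") or rule.get("excludes") or rule.get("args")
        if syscall in rule.get("names", []) and not conditional:
            return rule.get("action") == "SCMP_ACT_ALLOW"
    return profile.get("defaultAction") == "SCMP_ACT_ALLOW"


if __name__ == "__main__":
    for name in ("read", "unshare", "ptrace"):
        print(name, allowed_unconditionally(EXAMPLE_PROFILE, name))
```

With a profile like this, a container that has dropped its capabilities cannot reach the gated syscalls at all, while `--privileged` or `seccomp=unconfined` removes the filter entirely.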
The burden of proof lies with the person making empirically unfalsifiable claims.
Exploit the Linux kernel underneath it (not the only way, just the obvious one). Docker is a security boundary but it is not suitable for "I'm running arbitrary code".
That is to say, Docker is typically a security win because you get things like seccomp and user/DAC isolation "for free". That's great. That's a win. Typically exploitation requires a way to get execution in the environment plus a privilege escalation. The combination of those two things may be considered sufficient.
It is not sufficient for "I'm explicitly giving an attacker execution rights in this environment" because you remove the cost of "get execution in the environment" and the full burden is on the kernel, which is not very expensive to exploit.
> Exploit the Linux kernel underneath it (not the only way, just the obvious one). Docker is a security boundary but it is not suitable for "I'm running arbitrary code".
Docker is better for running arbitrary code compared to the direct `npm install <random-package>` that's common these days.
I moved to a Dockerized sandbox[1], and I feel much better now against such malicious packages.
It's better than nothing, obviously. But I don't consider `npm install <random-package>` to be equivalent to "RCE as a service", although it's somewhat close. I definitely wouldn't recommend `npm install <actually a random package>`, even in Docker.
I also implemented `insanitybit/cargo-sandbox` using Docker but that doesn't mean I think `insanitybit/cargo-sandbox` is a sufficient barrier to arbitrary code execution, which is why I also had a hardened `cargo add` that looked for typosquatting of package names, and why I think package manager security in general needs to be improved.
You can and should feel better about running commands like that in a container, as I said - seccomp and DAC are security boundaries. I wouldn't say "you should feel good enough to run an open SSH server and publish it for anyone to use".
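For anyone curious what a typosquatting check might look like, here is a minimal sketch of the general idea: flag a requested package name that is within a small edit distance of a well-known name without being identical to it. This is not the code from `insanitybit/cargo-sandbox`; the function names, the tiny `POPULAR` list, and the distance threshold are all assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def possible_typosquats(name, popular, max_distance=1):
    """Return the popular names that `name` might be imitating."""
    norm = name.lower().replace("_", "-")
    known = {p.lower().replace("_", "-"): p for p in popular}
    if norm in known:
        return []  # exact match for a known package, nothing to flag
    return sorted(p for k, p in known.items() if edit_distance(norm, k) <= max_distance)


# Stand-in for a real list of widely used package names.
POPULAR = ["serde", "tokio", "rand"]

if __name__ == "__main__":
    for candidate in ("serde", "serd", "tokoi", "left-pad"):
        print(candidate, possible_typosquats(candidate, POPULAR, max_distance=2))
```

A real check would need a much larger name list and would still miss attacks that don't rely on lookalike names, which is part of why a check like this is a complement to sandboxing rather than a replacement.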