Comment by bityard
3 years ago
There's no reason launching a container _must_ be slow. Under the hood, launching a containerized process is just a few kernel syscalls with very little overhead. You might be thinking of Docker, which is slow because it targets particular use cases and brings a lot of baggage with it in the name of backward compatibility, dating from a time when the Linux container ecosystem was much less mature.
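For a rough sketch of what "just a few syscalls" means, something like the following C program is the core of it: clone() into fresh PID/mount/UTS/IPC namespaces and exec a shell. This is only an illustration (it needs root, or an added CLONE_NEWUSER, and skips cgroups, pivot_root, seccomp, etc.), not what any particular runtime does verbatim.

```c
/* Minimal sketch: a "container" is clone() with namespace flags + exec.
 * Needs root (or CLONE_NEWUSER) to run; error handling kept minimal. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];

static int child(void *arg) {
    (void)arg;
    sethostname("demo", 4);          /* only affects the new UTS namespace */
    execlp("/bin/sh", "sh", NULL);   /* becomes PID 1 in the new PID namespace */
    perror("execlp");
    return 1;
}

int main(void) {
    int flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | SIGCHLD;
    pid_t pid = clone(child, child_stack + sizeof(child_stack), flags, NULL);
    if (pid < 0) { perror("clone"); return 1; }
    waitpid(pid, NULL, 0);
    return 0;
}
```

Compile with `gcc demo.c` and run as root; real runtimes layer cgroups, a private root filesystem, and security policy on top, but the kernel-side cost stays tiny.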
There are several projects working on fast, slim containers (and even VMs) with completely negligible startup times.
I don't know what is holding back container/VM access to graphics hardware, but it can't be insurmountable if the cloud providers are doing it.
The problem with containers and graphics drivers is that those drivers have a userspace component. This depends on the hardware (AMD, Intel, and NVidia all ship different drivers, of course), and in NVidia's case the userspace driver has to exactly match the kernel module version. (This is less of an issue with VMs, but then you need something like SR-IOV, which isn't really available on consumer HW, or dedicated PCIe passthrough, which takes the GPU away from the host.)
So version management becomes a major pita, from shipping drivers too old to support the hardware to shipping a driver that doesn't match the kernel. In the cloud this is mostly solved by using VMs and hardware with SR-IOV (and a fixed HW vendor, so you know which set of drivers to include).
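To make the mismatch failure mode concrete, here is a hypothetical check (not part of any driver tooling) in C: it reads the loaded NVidia kernel module's version from sysfs and compares it with whatever driver version a container image claims to bundle. The sysfs path only exists on Linux with the nvidia module loaded.

```c
/* Hypothetical sanity check for the NVIDIA kernel/userspace mismatch:
 * read the loaded kernel module's version and compare it against the
 * driver version a container image expects (passed as argv[1]). */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char kernel_ver[64] = {0};
    FILE *f = fopen("/sys/module/nvidia/version", "r");
    if (!f) { fprintf(stderr, "nvidia module not loaded?\n"); return 1; }
    if (!fgets(kernel_ver, sizeof(kernel_ver), f)) { fclose(f); return 1; }
    fclose(f);
    kernel_ver[strcspn(kernel_ver, "\n")] = '\0';

    if (argc < 2) { printf("kernel driver: %s\n", kernel_ver); return 0; }

    /* NVIDIA userspace libraries generally must match this exactly. */
    if (strcmp(kernel_ver, argv[1]) != 0) {
        fprintf(stderr, "mismatch: kernel %s vs image %s\n", kernel_ver, argv[1]);
        return 2;
    }
    printf("ok: %s\n", kernel_ver);
    return 0;
}
```

If the strings differ, the container's bundled libraries will typically fail to initialize the device, which is exactly the version-management pain described above.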
> I don't know what is holding back container/VM access to graphics hardware, but it can't be insurmountable if the cloud providers are doing it.
Cloud providers have graphics hardware with SR-IOV support. It's exactly the kind of functionality that GPU vendors use to segment their more expensive gear.
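For the curious, SR-IOV support shows up in sysfs. A rough C sketch (assumption: Linux, standard sysfs layout) that walks the PCI devices and reports how many virtual functions each one advertises; on most consumer GPUs the attribute simply won't exist:

```c
/* Rough sketch: walk /sys/bus/pci/devices and print how many SR-IOV
 * virtual functions each device advertises. Devices without the
 * sriov_totalvfs attribute (most consumer GPUs) are skipped. */
#include <dirent.h>
#include <stdio.h>

int main(void) {
    DIR *d = opendir("/sys/bus/pci/devices");
    if (!d) { perror("opendir"); return 1; }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.') continue;
        char path[600];
        snprintf(path, sizeof(path),
                 "/sys/bus/pci/devices/%s/sriov_totalvfs", e->d_name);
        FILE *f = fopen(path, "r");
        if (!f) continue;                 /* no SR-IOV support exposed */
        int total = 0;
        if (fscanf(f, "%d", &total) == 1 && total > 0)
            printf("%s: up to %d virtual functions\n", e->d_name, total);
        fclose(f);
    }
    closedir(d);
    return 0;
}
```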