Comment by 12_throw_away
7 days ago
FWIW, here are the repos for the CLI tool [1] and backend [2]. Looks like it is indeed VM-based container support (as opposed to WSLv1-style syscall translation or whatever):
Containerization provides APIs to:
[...]
- Create an optimized Linux kernel for fast boot times.
- Spawn lightweight virtual machines.
- Manage the runtime environment of virtual machines.
[1] https://github.com/apple/container [2] https://github.com/apple/containerization
I'm kinda ignorant about the current state of Linux VMs, but my biggest gripe with VMs is that OS kernels kind of assume they have access to all the RAM the hardware has - unlike the reserve/commit scheme processes use for memory.
Is there a VM technology that can make Linux aware that it's running in a VM, and be able to hand back the memory it uses to the host OS?
Or maybe could Apple patch the kernel to do exactly this?
Running Docker in a VM always has been quite painful on Mac due to the excess amount of memory it uses, and Macs not really having a lot of RAM.
It's still a problem for containers-in-VMs. You can in theory do something with either memory ballooning or (more modern) memory hotplugging, but the dance between the OS and the hypervisor takes a relatively long time to complete, and Linux just doesn't handle it well (eg. it inevitably places unmovable pages into newly reserved memory, meaning it can never be unplugged). We never found a good way to make applications running inside the VM able to transparently allocate memory. You can overprovision memory, and hypervisors won't actually allocate it on the host, and that's the best you can do, but this also has problems since Linux tends to allocate a bunch of fixed data structures proportional to the size of memory it thinks it has available.
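To put a rough number on the "fixed data structures proportional to the size of memory" point, here is a back-of-the-envelope sketch. It assumes a 64-byte per-page descriptor and 4 KiB pages, which is typical for x86-64 Linux but is my assumption, not something stated in the thread. Guests with larger pages (e.g. 16 KiB) would pay proportionally less, and other size-scaled structures are ignored here.

```python
# Rough estimate of the guest kernel's per-page bookkeeping cost.
# Assumptions (not from the thread): 64-byte page descriptor, 4 KiB pages.
STRUCT_PAGE_BYTES = 64
PAGE_BYTES = 4096
GIB = 1024 ** 3


def memmap_overhead_bytes(advertised_bytes: int,
                          struct_page_bytes: int = STRUCT_PAGE_BYTES,
                          page_bytes: int = PAGE_BYTES) -> int:
    """Bytes of page metadata for a guest told it has `advertised_bytes` of RAM."""
    pages = advertised_bytes // page_bytes
    return pages * struct_page_bytes


for advertised_gib in (4, 16, 64):
    overhead = memmap_overhead_bytes(advertised_gib * GIB)
    print(f"{advertised_gib} GiB advertised -> "
          f"{overhead // 2**20} MiB of page metadata")
```

Under those assumptions the overhead is about 1.6% of whatever the guest is told it has, paid whether or not the memory is ever touched.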
That's called memory ballooning and is supported by KVM on Linux. Proxmox for example can do that. It does need support on both the host and the guest.
it's not as straightforward a solution as it sounds, though
> Is there a VM technology that can make Linux aware that it's running in a VM, and be able to hand back the memory it uses to the host OS?
Isn't this an issue of the hypervisor? The guest OS is just told it has X amount of memory available, whether this memory exists or not (hence why you can overallocate memory for VMs), whether the hypervisor will allocate the entire amount or just what the guest OS is actually using should depend on the hypervisor itself.
> or just what the guest OS is actually using should depend on the hypervisor itself.
How can the hypervisor know which memory the guest OS is actually using? It might have used some memory in the past and now no longer needs it, but from the POV of the hypervisor it might as well be used.
This is a communication problem between hypervisor and guest OS, because the hypervisor manages the physical memory but only the guest OS knows how much memory should actually be used.
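To illustrate why only the guest has this information: inside a Linux guest, the kernel's own estimate is exposed in /proc/meminfo, which the hypervisor cannot see. A minimal sketch, assuming the standard `MemAvailable` field is present; the 256 MiB reserve and the `reclaimable_kib` helper are hypothetical choices of mine, not how any real balloon driver decides.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. hugepage counts) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_kib(meminfo: dict, reserve_kib: int = 256 * 1024) -> int:
    """Memory the guest could plausibly offer back, keeping a safety reserve.

    The reserve size is an arbitrary illustrative value.
    """
    return max(0, meminfo.get("MemAvailable", 0) - reserve_kib)


# Sample data so the sketch also runs on hosts without /proc/meminfo.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
MemAvailable:    6000000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(reclaimable_kib(parse_meminfo(text)), "KiB could be offered back")
```

Note the gap between `MemFree` and `MemAvailable` in the sample: page cache counts as "used" from the outside but is mostly reclaimable, which is exactly the distinction the hypervisor cannot make on its own.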
Just looked it up - and the answer is 'balloon drivers', which are special drivers loaded by the guest OS that can request pages from, and return unused pages to, the host hypervisor.
Apparently Docker for Mac and Windows uses these, but in practice Docker containers tend to grow quite large in terms of memory, so I'm not sure how well it works; it certainly overallocates compared to running Docker natively on a Linux host.
The short answer is yes, Linux can be informed to some extent but often you still want a memory balloon driver so that the host can “allocate” memory out of the VM so the host OS can reclaim that memory. It’s not entirely trivial but the tools exist, and it’s usually not too bad on vz these days when properly configured.
It’s one reason I don’t like WSL2. When you compile something which needs 30 GB RAM, the only thing you can do is terminate the WSL2 VM to get that RAM back.
Since late 2023, WSL2 has supported "autoMemoryReclaim", nominally still experimental, but works fine for me.
Add:
[experimental]
autoMemoryReclaim=gradual
to your .wslconfig
See: https://learn.microsoft.com/en-us/windows/wsl/wsl-config
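If you want to check whether the setting is already present, here is a small sketch that parses .wslconfig-style text with Python's standard `configparser`. It assumes the file is plain INI, and `auto_memory_reclaim_mode` is a hypothetical helper name of mine, not part of any WSL tooling.

```python
import configparser
from typing import Optional


def auto_memory_reclaim_mode(wslconfig_text: str) -> Optional[str]:
    """Return the autoMemoryReclaim value from .wslconfig text, or None if unset."""
    # interpolation=None so values containing '%' are not treated specially;
    # strict=False tolerates duplicated sections or keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(wslconfig_text)
    # configparser lowercases option names, so the lookup is case-insensitive.
    return parser.get("experimental", "autoMemoryReclaim", fallback=None)


EXAMPLE = """\
[wsl2]
memory=8GB

[experimental]
autoMemoryReclaim=gradual
"""

print(auto_memory_reclaim_mode(EXAMPLE))
```

To run it against your real config you would read the file from your Windows user profile directory and pass its contents in.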
I just noticed the addition of a container cask when I ran “brew update”.
I chased the package’s source and indeed it’s pointing to this repo.
You can install and use it now on the latest macOS (not 26). I just ran “container run nginx” and it worked alright it seems. Haven’t looked deeper yet.
There’s some problem with networking: if you try to run multiple containers, they won’t see each other. Could probably be solved by running a local VPN or something.
WSLv1 never supported a native docker (AFAIK, perhaps I'm wrong?)
That said, I'd think apple would actually be much better positioned to try the WSL1 approach. I'd assume apple OS is a lot closer to linux than windows is.
This doesn't look like WSL1. They're not translating Linux syscalls into macOS kernel calls, but running Linux in a VM, more like the WSL2[0] approach.
[0] https://devblogs.microsoft.com/commandline/announcing-wsl-2/...
In the end they'd probably run into the same issues that killed WSL1 for Microsoft: the Linux kernel has enormous surface area, and lots of pretty subtle behaviour, particularly around the stuff that is most critical for containers, like cgroups and user namespaces. There isn't an externally usable test suite that could be used to validate Microsoft's implementation of all these interfaces, because... well, why would there be?
Maintaining a working duplicate of the kernel-userspace interface is a monumental and thankless task, and especially hard to justify when the work has already been done many times over to implement the hardware-kernel interface, and there's literally Hyper-V already built into the OS.
Yeah, it probably would be feasible to dust off the FreeBSD Linux compatibility layer[1] and turn that into native support for Linux apps on Mac.
I think Apple’s main hesitation would be that the Linux userland is all GPL.
[1]: https://docs.freebsd.org/en/books/handbook/linuxemu/
If they built as a kernel extension it would probably be okay with gpl.
There’s a huge opportunity for Apple to make kernel development for xnu way better.
Tooling right now is a disaster — very difficult to build a kernel and test it (eg in UTM, etc.).
If they made this better and took more of an OSS openness posture like Microsoft, a lot of incredible things could be built for macOS.
I’ll bet a lot of folks would even port massive parts of the kernel to rust for them for free.