>Starting with 4 virtual cores and 8 GB vRAM, where the VM ran perfectly briskly with around 5 GB of memory used, I stepped down to 3 cores and 6 GB, to discover that memory usage fell to 3.9 GB and everything worked well. With just 2 cores and 4 GB of memory only 3.1 GB of that was used, and the VM continued to handle those lightweight tasks normally.
Good reminder that there's a certain amount of memory tied up with each core (probably mainly page cache and concurrency handling etc).
As a general rule, the amount of physical memory installed in a computer should also be proportional to the number of hardware threads provided by its CPU.
Besides the fact that the operating system may allocate some memory for each thread, when you launch a multi-threaded application that can use all available threads, for instance the compilation of a big software project, it will frequently allocate working memory in proportion to the number of worker threads.
I have encountered many multi-threaded applications that need up to 2 GB per thread to work well.
This corresponds to having 64 GB for a desktop CPU with 32 threads, like Ryzen 9 9950X.
For the compilation example, I have seen software projects, like Chrome/Chromium and its derivatives, where if you do not have enough memory, proportional to the number of hardware threads, e.g. when you have only 32 GB for a 16 core/32 thread CPU, you must reduce the number of concurrent compilations, e.g. with an appropriate parameter to "make -j", leaving some threads and cores idle, because otherwise you may encounter out-of-memory errors.
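A minimal sketch of that trade-off, assuming you have a rough estimate of peak memory per compile job (the function name `safe_job_count` and the per-job figures are illustrative, not taken from any build tool):

```python
import os


def safe_job_count(hw_threads: int, ram_gb: float, gb_per_job: float,
                   reserve_gb: float = 0.0) -> int:
    """Largest parallel job count whose estimated peak memory fits in RAM.

    gb_per_job is an assumed per-job peak; reserve_gb is memory kept back
    for the OS and other applications.
    """
    if hw_threads < 1 or gb_per_job <= 0:
        raise ValueError("hw_threads must be >= 1 and gb_per_job must be > 0")
    by_memory = int((ram_gb - reserve_gb) // gb_per_job)
    # Never exceed the hardware thread count, and always allow one job.
    return max(1, min(hw_threads, by_memory))


if __name__ == "__main__":
    threads = os.cpu_count() or 1
    # Example: 32 GB of RAM and an assumed 2 GB per compile job.
    print(f"make -j{safe_job_count(threads, ram_gb=32, gb_per_job=2)}")
```

With 32 threads, 32 GB and 2 GB per job this yields 16 jobs, i.e. half the threads stay idle, which matches the situation described above.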
Compiling flash-attn (Flash Attention) is another great stress test for CPU+RAM, as using just 16 threads can already balloon you into 128GB RAM usage territory. Same thing there: you need to limit concurrency when compiling it.
It's an important point. I went from 4c/8t and 32GB to 16c/32t and 96GB. Dramatically less memory per thread. Some software (looking at you, Vivado) can take incredible amounts of memory per parallel job, so some projects can only run on a subset of my cores. At least until I stepped up my work laptop to 10.66 GB/thread. That seems to be manageable.
Yes! I have also observed that with compilation VMs on a big server.
I'd bet for the null hypothesis: the memory behaviour changes would hold if the core count was kept constant and only the VM's memory size was adjusted.
Agreed. This is the OS adapting to available memory.
Similarly if you started with 4GB and there was 900MB available for user apps, I expect you could launch apps that consume 1500MB just fine; the OS is leaving enough to launch anything, and making use of unused memory for cache/etc.
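For what it's worth, the three configurations quoted at the top of the thread can be put side by side. This is only arithmetic on the quoted figures (the helper name `summarize` is made up for illustration), and since cores and RAM were reduced together it cannot separate the two explanations by itself:

```python
# (virtual cores, RAM allocated in GB, memory used in GB) as quoted in the thread
configs = [(4, 8.0, 5.0), (3, 6.0, 3.9), (2, 4.0, 3.1)]


def summarize(cores: int, ram_gb: float, used_gb: float) -> dict:
    """Derive per-core usage, fraction of RAM used, and free memory."""
    return {
        "used_per_core_gb": round(used_gb / cores, 2),
        "used_fraction": round(used_gb / ram_gb, 3),
        "free_gb": round(ram_gb - used_gb, 1),
    }


for cores, ram_gb, used_gb in configs:
    print(cores, ram_gb, summarize(cores, ram_gb, used_gb))
```

Usage per core rises as cores are removed (1.25, 1.3, 1.55 GB), so a fixed per-core cost alone does not fit the numbers; the used fraction of RAM also rises (62.5%, 65%, 77.5%), which is at least consistent with the OS trimming caches as memory gets tighter.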
There is a per-cpu data structure in the xnu kernel, but it is not big enough to tilt the scales when you are talking about RAM in units of gigabytes.
There is some overhead per-core, you're right, but imo this reduction in usage is likely from how the kernel allocates available memory, which is being reduced as well. The kernel will keep read caches around longer with more memory, it'll prefer to compress memory instead of swap to disk if it has more, it'll purge/cleanup reclaimable memory less often with more memory, etc. It even scales its internal buffer sizes and vnode tables depending on total memory.
All good things imo, it dynamically makes the most of what is available, at the expense of making it harder to see a true baseline of hard min requirement to operate.
Fun things to check, `vm_stat`
$ vm_stat
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 230295.
Pages active: 1206857.
Pages inactive: 1206361.
Pages speculative: 31863.
Pages throttled: 0.
Pages wired down: 470093.
Pages purgeable: 18894.
"Translation faults": 21635255.
Pages copy-on-write: 1590349.
Pages zero filled: 11093310.
Pages reactivated: 15580.
Pages purged: 50928.
File-backed pages: 689378.
Anonymous pages: 1755703.
Pages stored in compressor: 0.
Pages occupied by compressor: 0.
Decompressions: 0.
Compressions: 0.
Pageins: 832529.
Pageouts: 225.
Swapins: 0.
Swapouts: 0.
edit: no code fence markdown support or am I doing something wrong?
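The counters above are in pages, so they are easier to read once converted to GiB. A rough sketch, assuming the output keeps the `label: count.` layout shown above (the function names are made up for illustration; the page size is read from the header rather than hard-coded, since it differs between machines):

```python
import re


def parse_vm_stat(text: str) -> dict:
    """Parse vm_stat output into {'page_size': int, 'pages': {label: count}}."""
    header = re.search(r"page size of (\d+) bytes", text)
    if header is None:
        raise ValueError("page size not found in vm_stat output")
    pages = {}
    for line in text.splitlines():
        # Matches e.g. 'Pages free: 230295.' and '"Translation faults": 21635255.'
        m = re.match(r'\s*"?([^":]+)"?:\s+(\d+)\.?\s*$', line)
        if m:
            pages[m.group(1).strip()] = int(m.group(2))
    return {"page_size": int(header.group(1)), "pages": pages}


def pages_to_gib(count: int, page_size: int) -> float:
    """Convert a page count to GiB."""
    return count * page_size / 2**30


SAMPLE = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 230295.
Pages wired down: 470093.
"Translation faults": 21635255.
"""

stats = parse_vm_stat(SAMPLE)
for label in ("Pages free", "Pages wired down"):
    gib = pages_to_gib(stats["pages"][label], stats["page_size"])
    print(f"{label}: {gib:.2f} GiB")
```

On a Mac you could feed it live data with `subprocess.run(["vm_stat"], capture_output=True, text=True).stdout` instead of the sample string.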
Single inline backticks like `this` aren't recognized (although still useful in my opinion, they just don't change the rendering).
Triple backticks also aren't recognized. However, if you indent by I believe 4 spaces, it formats it in a fixed width font presuming it's code.
Let's try (4 spaces):
None for comparison:
func main() { fmt.Println("Hello, HN!") }
Got an M5 Air recently - my first dive into macOS land, so I'm trying to figure this out too.
Seems essentially impossible to get:
* pytorch
* GPU acceleration
* VM/container like isolation
The virtio-gpu layer gets closest but seems to only pass through graphics GPU not compute GPU so no pytorch
I need this too, and looked into it quite a lot a year ago. I haven’t had time to check out the recent developments with Docker Model Runner (vllm-metal) or podman libkrun. Did neither of those work for you?
vllm-metal isn't GPU access but rather an OpenAI-compatible endpoint, which I can already get via an LM Studio endpoint over the network.
>podman libkrun
Haven't tried it, but research suggests it's still really shaky. podman libkrun exposes Vulkan while torch expects MPS on Macs. Sounds like one can force Vulkan, but that's apparently slow and beta-ish?
I got torch to run in a Cirruslabs Tart instance.
By "Instance" do you mean their cloud platform?
My only experience with VMs on macOS is colima+docker, and it's relatively painful and inefficient (but usable).
Try Apple's container CLI. I moved a project of mine from colima+docker to it relatively easily, a couple of weekends ago.
https://github.com/apple/container
Here's an example of how to build a simple Alpine Linux container using Apple's containerization CLI. It also demonstrates how to connect to the container through Tailscale SSH using a Tailscale auth key stored in Apple Keychain:
https://github.com/highpost/tailscale-macos-container
Does this project aim for docker cli and api compatibility? Searching for Docker on that page yields no results. Though in their example, they do show an example of a Dockerfile referencing docker.io without shame.
Typical Apple behavior, I guess, but grating to see in an OSS tool.
I'm curious to know what kind of project is macOS exclusive?
container is really good. I've been using it to sandbox some CLI tools, and it starts up in less than a second.
AFAIK no support for Compose though
Thank you for this, will check it out!
Recently got a Mac Mini for local CI purposes (together with Forgejo Actions), took a broad look at the ecosystem and decided to just roll with "build on host" instead. Setting up signing/notarization while also isolating it from the host just looked like an insurmountable task, even with agents. At least the macOS builds are really fast now and the signing/notarization is just ~200 lines of Bash...
> the signing/notarization just ~200 lines of Bash
200 lines?! That’s two orders of magnitude too many. What exactly are you doing that you need so much code for signing and notarisation?
Could you share your recipe, please? I’m interested.
OrbStack is pretty good. I don't find it inefficient, really.
OrbStack is impressive on the performance and energy efficiency fronts. I'm not aware of anything that comes close. But they're doing something funky under the covers. You can't just start any OS in a VM. It has to be somehow mangled to suit their VM. Thankfully NixOS is available so I'm fine for my use cases. It's still remarkable how efficient it is.
https://github.com/trycua/cua/tree/main/libs/lume had an interesting take on this.
> Starting with 4 virtual cores and 8 GB vRAM, where the VM ran perfectly briskly with around 5 GB of memory used
But... if you start applications inside your VM, won't it want the full 8 GB you've allocated, not the 5 GB it uses at startup?
I don’t assume that macOS virtualization is advanced enough to support memory ballooning, or is that not what you’re referring to?
Edit: I stand corrected!
I don't assume anything either, but a single Google search is enough to dispel that [1]
[1] https://developer.apple.com/documentation/virtualization/vzv...
macOS is generally pretty amazing at efficient memory usage and VM (virtual memory subsystem) handling. So even an 8GB machine can run pretty impressive workloads without having the user think the machine is underpowered.
What will that help with if the host and guest combined need > physical ram?
Honestly macOS probably can go much lower than that if you turn off some stuff that's not strictly necessary for a VM. The first iPhones only had 128 MiB of RAM and they ran a trimmed down version of macOS Tiger I believe. It's just that RAM has been quite abundant so far, so there was no real reason to try to trim it down, but it's definitely possible, and probably not that hard either, we just need to start trying again :)
Well, early iPhones did not have app multitasking, so that's quite the difference. Any app was killed when closed.
Yes it did. You just couldn’t use it. I could send a text message while listening to music. Sometimes the music would crash due to OOM.
Maybe I’m nitpicking, but there is no such thing as “macOS Tiger”. It was called Mac OS X at the time, so it’s Mac OS X Tiger.
How do you VM it up? What tool do you use?
Apple's built-in virtualization framework. For macOS guests, tart is probably the best out there. Apple's own `container` CLI tool for linux/docker-like containers.
I think I got the smallest:
used to cross-build to darwin.
I was hoping to see the bare macOS with all the applications removed as much as possible, no graphical user interface, just the bare minimum to boot, login as a user, and write hello world dot txt with a text editor. Or maybe some command line apps? Or is it no longer macOS at that point?
You can boot regular macOS directly to a root terminal in “Single User Mode”. This was easier on Intel macs of yore but is also possible on M1+
Below content from https://eclecticlight.co/2020/11/28/startup-modes-for-m1-mac...
Launch 1 True Recovery, open Terminal, then run “bputil -a” (without the quotes) to downgrade system security and allow for more boot arguments. You might need to restart after this step.
Then, run [nvram boot-args="-s"] (without the square brackets). Restart to launch Single User Mode.
Once in Single User Mode, run these commands (in the following order) to mount the root volume group:
1. mount -P 1
2. /usr/libexec/init_data_protection
3. mount -P 2
Future restarts will always launch Single User Mode first. To stop launching Single User Mode, run [nvram boot-args=""] (without the square brackets).
To restore your system to full security, run “bputil -f” (without the quotes). If you choose to run that command in macOS, prefix “sudo” to the beginning.
"I'd just like to interject for a moment. What you're referring to as macOS, is in fact, macOS/Darwin, or as I've recently taken to calling it, macOS plus Darwin."
"What you're referring to as Darwin, is in fact, Darwin/XNU."
"What you're referring to as XNU, is in fact, BSD/Mach."
I seem to remember it being possible to run macOS-less Darwin several years ago, not sure if that's still possible or if Apple has modified it so much at this point that it's useless without at least some macOS components.
> several years ago
2024, maybe? needs some renewed interest perhaps:
https://www.puredarwin.org/
https://github.com/apple/darwin-xnu
Apple stopped updating this 5 years ago.
I remember getting it to boot once long ago but I didn't have anything to actually do with it.
Kind of a random question, but would it be feasible to enroll a macOS VM in Intune as a personal device?
Maybe, but then likely only as a BYOD. A company owned enrollment setup requires linking up with Apple Business Manager.
Is it possible to run macOS on a PC? Or at least develop for the Mac in some way on a PC?
It's called a Hackintosh; there's plenty of information on that.
You can boot into macOS with QEMU, but you won't have hardware-accelerated graphics or a handful of other features.
Which features? Apple Pay?
I am so curious why no one has made an agent-specific environment for macOS, i.e. one where the agent spawns inside a Mac environment.
I'm wondering if the Xcode simulator (without Xcode running) performs as well, my 2020 Intel MacBook Air has been incapable of running Safari in iOS smoothly for nearly all its life.
MacBook Neo should run rings around any Intel Air: Geekbench shows it at 250% of the score of the 2020 Intel Air.
https://browser.geekbench.com/v6/cpu/compare/17022784?baseli...
My M1 Air, which was my personal Mac, generally stomped my work MBP 2019 with an Intel chip.
The difference between the absolutely silent M1 and the hairdryer Intel was staggering.
I’m sure you’re completely right.
You’re going to love that newfangled M1 chip.
"We might hope that macOS would process AI tasks using the CPU and GPU rather than the neural engine, when running in a VM."
That specific Geekbench test is to measure the ANE performance, which they did by setting the CoreML run to cpuAndNeuralEngine. They could have set it to all and it would use any hardware available, but that would be counterproductive to a test that hopes to measure the ANE, no?
And note that there is no "just ANE" option. In this case it is probably the virtualized CPU side of the equation that's yielding the massive slowdowns for int8 and quantized runs.
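To make the option set concrete, here is a small stand-in for Core ML's `MLComputeUnits` cases (`cpuOnly`, `cpuAndGPU`, `cpuAndNeuralEngine`, `all`). The Python enum below is a hypothetical mirror for illustration only, not the Core ML API, and the option only restricts which hardware is eligible; where each layer actually runs is decided by Core ML at runtime:

```python
from enum import Enum


class ComputeUnits(Enum):
    """Hypothetical mirror of MLComputeUnits: each value is the set of
    hardware the model is allowed (not guaranteed) to run on."""
    CPU_ONLY = frozenset({"cpu"})
    CPU_AND_GPU = frozenset({"cpu", "gpu"})
    CPU_AND_NEURAL_ENGINE = frozenset({"cpu", "ane"})
    ALL = frozenset({"cpu", "gpu", "ane"})


def allows(units: ComputeUnits, hardware: str) -> bool:
    """True if the given compute-unit option permits this hardware."""
    return hardware in units.value


# No option permits the ANE without also permitting the CPU.
ane_only = [u for u in ComputeUnits if u.value == frozenset({"ane"})]
print("ANE-only options:", ane_only)
print("GPU allowed under cpuAndNeuralEngine:",
      allows(ComputeUnits.CPU_AND_NEURAL_ENGINE, "gpu"))
```

Since the CPU is in every set, a slow virtualized CPU can drag down a run that is nominally targeting the ANE.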
The ANE isn't the problem here.
https://dennisforbes.ca/blog/microblog/2026/02/apple-neural-...