← Back to context

Comment by anematode

11 hours ago

Ah, that's a great hypothesis. I wonder, then, how it works with x86 emulation on ARM. IIRC, atomic ops on ARM fault if the address isn't naturally aligned... but I guess the runtime could intercept that and handle it slowly.

ARM macs apparently have some kind of specific handling in place for this when a process is running with x86_64 compatibility, but it’s not publicly documented anywhere that I can see.

An emulated x86 atomic instruction wouldn’t need to use atomic instructions on ARM.

  • Why not?

    • They don’t have to match.

      As an example, what about a divide instruction. A machine without an FPU can emulate a machine that has one. It will legitimately have to run hundreds/thousands of instructions to emulate a single divide instruction, it will certainly take longer.

      Thats OK, just means the emulation is slower doing that than something like add that the host has a native instruction for. In ‘emulator time’ you still only ran one instruction. That world is still consistent.

      3 replies →