Comment by anematode

11 hours ago

Ah, that's a great hypothesis. I wonder, then, how it works with x86 emulation on ARM. IIRC, atomic ops on ARM fault if the address isn't naturally aligned... but I guess the runtime could intercept that and handle it slowly.

omcnoe 9 hours ago

ARM Macs apparently have specific handling for this when a process runs in x86_64 compatibility mode, but it’s not publicly documented anywhere that I can see.

my123 8 hours ago
XNU has this oddity: https://github.com/apple-oss-distributions/xnu/blob/f6217f89...
It's redacted from the open-source XNU, but exists in the closed-source version.
- omcnoe 7 hours ago
  
  Is it actually redacted, or just a leftover stub from a feature implemented in silicon instead of software? Isn't the x86 memory order compatibility done at hardware level?

BobbyTables2 11 hours ago

An emulated x86 atomic instruction wouldn’t need to use atomic instructions on ARM.
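
To make that concrete, here is a minimal sketch of one conceivable slow path, not a description of how any real translator works. It emulates a 32-bit x86 `LOCK CMPXCHG` and `LOCK ADD` with ordinary loads and stores under a single emulator-wide lock, so the guest address does not need to be aligned. The names `guest_memory`, `emulated_lock_cmpxchg32` and `emulated_lock_add32` are made up for illustration, and the approach is only correct under the assumption that every guest access to that memory goes through the same lock.

```python
import struct
import threading

# Hypothetical guest memory and emulator-wide lock (illustrative names only).
guest_memory = bytearray(64)
_emulator_lock = threading.Lock()


def emulated_lock_cmpxchg32(addr, expected, new):
    """Emulate a 32-bit LOCK CMPXCHG at any byte address, aligned or not.

    Returns (success, old_value), where old_value is what memory held before.
    """
    with _emulator_lock:
        # Plain little-endian load; alignment is irrelevant to the emulator.
        (old,) = struct.unpack_from("<I", guest_memory, addr)
        if old == expected:
            struct.pack_into("<I", guest_memory, addr, new & 0xFFFFFFFF)
            return True, old
        return False, old


def emulated_lock_add32(addr, delta):
    """Emulate a 32-bit LOCK ADD as a read-modify-write under the same lock."""
    with _emulator_lock:
        (old,) = struct.unpack_from("<I", guest_memory, addr)
        # Wrap around at 32 bits, as the guest instruction would.
        struct.pack_into("<I", guest_memory, addr, (old + delta) & 0xFFFFFFFF)
```

From the guest's point of view each call is still a single indivisible instruction, even though the host ran many ordinary ones to get there.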

dooglius 11 hours ago
Why not?
- MBCook 10 hours ago
  
  They don’t have to match.
  As an example, take a divide instruction. A machine without an FPU can emulate a machine that has one. It may legitimately have to run hundreds or thousands of instructions to emulate a single divide instruction, so it will certainly take longer.
  That’s OK; it just means the emulation is slower for that instruction than for something like add, which the host has a native instruction for. In ‘emulator time’ you still only ran one instruction. That world is still consistent.
  
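
The divide example can be sketched roughly as follows. This is a simplification that uses 32-bit unsigned integer division rather than floating point, and `emulated_udiv32` is a hypothetical helper name. It uses only shifts, compares and subtracts, and counts loop iterations to show how one guest instruction turns into many host steps.

```python
def emulated_udiv32(dividend, divisor):
    """Restoring (shift-and-subtract) division on 32-bit unsigned values.

    Returns (quotient, remainder, host_steps), where host_steps counts the
    loop iterations the host ran for this one guest divide.
    """
    if divisor == 0:
        # A real emulator would raise the guest's divide-error exception here.
        raise ZeroDivisionError("guest divide error")
    quotient = 0
    remainder = 0
    host_steps = 0
    for bit in range(31, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        host_steps += 1
    return quotient, remainder, host_steps


print(emulated_udiv32(100, 7))  # (14, 2, 32)
```

The guest sees one divide retire with the right quotient and remainder; the 32 iterations (each several host operations) are invisible to it.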