Comment by marcan_42
4 years ago
It isn't, because otherwise it would show roughly the same performance with and without sync commands, as I showed in the thread. There is a significant performance loss for every drive, but Apple's is way worse.
There is no real excuse for a single-sector write to take ~20ms to flush to NAND while the NAND controller is generating some 10MB/s of DRAM traffic. This is a dumb firmware design issue.
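For context, the measurement being described is essentially this (a minimal sketch; the file path and iteration count are made up, and on macOS you'd swap fsync() for fcntl(fd, F_FULLFSYNC) to actually reach the NAND):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    char buf[512];
    memset(buf, 0xab, sizeof(buf));

    int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 100; i++) {
        // One single-sector write, then force it to stable media.
        if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) { perror("pwrite"); return 1; }
        if (fsync(fd) < 0) { perror("fsync"); return 1; }  // one cache flush per write
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("avg write+flush latency: %.2f ms\n", ms / 100);
    close(fd);
    return 0;
}
```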
macOS may be interpreting the sync request differently. You aren't comparing apples to apples, quite literally.
Why not compare macOS and Linux on approved x86 Mac hardware, i.e. a Fusion Drive or whatever?
Also, as suggested: try F_BARRIERFSYNC, which flushes everything before the barrier (used for WALs, IIRC).
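For anyone unfamiliar with the macOS-specific calls, the three durability levels look roughly like this (a sketch; fd is assumed to be an open, written-to file descriptor, and this compiles only on macOS where these fcntl commands exist):

```c
#include <fcntl.h>
#include <unistd.h>

void sync_levels(int fd) {
    fsync(fd);                  // pushes data to the drive, but not through its cache
    fcntl(fd, F_BARRIERFSYNC);  // fsync + barrier: later I/O won't be reordered before it
    fcntl(fd, F_FULLFSYNC);     // full flush to stable storage (the slow path on Apple NVMe)
}
```

F_BARRIERFSYNC is much cheaper than F_FULLFSYNC because it only orders I/O around the barrier instead of forcing the whole drive cache out, which is why it's the recommended call for write-ahead logs.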
This affects T2 Macs too, which use the same NVMe controller design as M1 Macs.
We've looked at NVMe command traces from running macOS under a transparent hypervisor. We've issued NVMe commands outside of Linux from a bare-metal environment. The 20ms flush penalty is there for Apple's NVMe implementation. It's not some OS thing, and other drives don't have it. I also checked, and Apple's NVMe controller is doing 10MB/s of DRAM memory traffic when issued flushes, for some reason (yes, we can get those stats).

And we know macOS does not properly flush with just fsync(), because it actively loses data on hard shutdowns. We've been fighting this issue for a while now; it only just hit us yesterday/today that there is no magic in macOS. It simply doesn't flush, and doesn't guarantee data persistence, on fsync().
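For reference, you can reproduce this kind of raw-flush timing from Linux userspace with the NVMe passthrough ioctl, bypassing the filesystem entirely. This is only an illustrative sketch (device path and namespace id are assumptions, and it's not their actual bare-metal tooling; needs root):

```c
#include <fcntl.h>
#include <linux/nvme_ioctl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int fd = open("/dev/nvme0n1", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct nvme_passthru_cmd cmd;
    memset(&cmd, 0, sizeof(cmd));
    cmd.opcode = 0x00;  // NVM command set: Flush
    cmd.nsid = 1;       // namespace 1 (assumption)

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (ioctl(fd, NVME_IOCTL_IO_CMD, &cmd) < 0) { perror("ioctl"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("raw NVMe flush took %.2f ms\n", ms);
    close(fd);
    return 0;
}
```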
I've just been scanning through Linux kernel code (incl. ext4). Are you sure that it's not issuing a PREFLUSH? What are your barrier options on the mount? I think you will find these are going to be more like F_BARRIERFSYNC.
I couldn't find much info about it, but the official docs are here: https://kernel.org/doc/html/v5.17-rc3/block/writeback_cache_...
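If you want to check the barrier options on a running system, something like this works (a sketch using getmntent(); ext4 has barriers on by default, so you're looking for nobarrier or barrier=0 in the options):

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *mtab = setmntent("/proc/mounts", "r");
    if (!mtab) { perror("setmntent"); return 1; }
    struct mntent *m;
    while ((m = getmntent(mtab)) != NULL) {
        if (strcmp(m->mnt_type, "ext4") != 0)
            continue;
        // Barriers (and thus PREFLUSH on fsync) are on unless explicitly disabled.
        int off = hasmntopt(m, "nobarrier") != NULL ||
                  hasmntopt(m, "barrier=0") != NULL;
        printf("%s on %s: barriers %s (opts: %s)\n",
               m->mnt_fsname, m->mnt_dir, off ? "OFF" : "on", m->mnt_opts);
    }
    endmntent(mtab);
    return 0;
}
```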
It seems to be pretty apples to apples: they're running the same benchmark using equivalent data storage APIs on both systems. What are you thinking might be different? That the Linux+WD drive isn't making the data durable? Or that macOS does something stupid that causes the slowdown rather than the drive? Both seem implausible.
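Concretely, "equivalent data storage APIs" here means using whichever call actually guarantees persistence on each OS, something like this (an illustrative helper, not the benchmark's actual code):

```c
#include <fcntl.h>
#include <unistd.h>

// On Linux, fsync() already issues a drive cache flush; on macOS, only
// the F_FULLFSYNC fcntl does, so a fair comparison pairs these two.
static int durable_sync(int fd) {
#ifdef __APPLE__
    return fcntl(fd, F_FULLFSYNC);
#else
    return fsync(fd);
#endif
}
```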