Comment by monocasa
4 years ago
On at least one drive I've seen, the flush command was instead treated as a barrier on commands being committed to the log in controller DRAM. That cuts into parallelism, and therefore throughput, so it shows up as a latency spike even though nothing is actually flushed out of the cache.
My test is single threaded, and thus has no parallelism to begin with.
The drive controller is internally parallel. A write is just a job queue submission, so the next write arrives while the controller is still processing previous requests.
People have tested this stuff on storage devices with torture tests. Can you point at an example of a modern (directly attached) NVMe drive from a reputable vendor that cheats at this?
FWIW, macOS also has F_BARRIERFSYNC, which is still much slower than full syncs on the competition.