Comment by cryptonector

4 years ago

How are you reading POSIX as "saying no"??

From that page:

  The fsync() function shall request that all data for
  the open file descriptor named by fildes is to be
  transferred to the storage device associated with the
  file described by fildes. The nature of the transfer
  is implementation-defined. The fsync() function shall
  not return until the system has completed that action
  or until an error is detected.

then:

  The fsync() function is intended to force a physical
  write of data from the buffer cache, and to assure
  that after a system crash or other failure that all
  data up to the time of the fsync() call is recorded
  on the disk. Since the concepts of "buffer cache",
  "system crash", "physical write", and "non-volatile
  storage" are not defined here, the wording has to be
  more abstract.

The only reason to doubt the clarity of the above is that POSIX does not consider crashes and power failures to be in scope. It says so right in the quoted text.

Crashes and power failures are just not part of the POSIX worldview, so in POSIX there can be no need for sync(2) or fsync(2), or fcntl(2) w/ F_FULLFSYNC! Why even bother having those system calls? Why even bother having the spec refer to the concept at all?

Well, the reality is that some allowance must be made for crashes and power failures, and that includes some mechanism for flushing caches all the way to persistent storage. POSIX is a standard that some real-life operating systems aim to meet, but those operating systems have to deal with crashes and power failures because those things happen in real life, and because their users want the operating systems to handle those events as gracefully as possible. Some data loss is always inescapable, but data corruption would be very bad, which is why filesystems and applications try to do things like write-ahead logging and so on.

That is why sync(2), fsync(2), fdatasync(2), and F_FULLFSYNC exist. It's why they [well, some of them] existed in Unix, it's why they still exist in Unix derivatives, it's why they exist in Unix-alike systems, it's why they exist in Windows and other not-remotely-POSIX operating systems, and it's why they exist in POSIX.

If they must exist in POSIX, then we should read the quoted and linked page, and it is pretty clear: "transferred to the storage device" and "intended to force a physical write" can only mean... what that says.

It would be fairly outrageous for an operating system to say that since crashes and power failures are outside the scope of POSIX, the operating system will not provide any way to save data persistently other than to shut down!

2 comments

supermatt 4 years ago

> transferred to the storage device

MacOS does that.

> the fsync() function is intended to force a physical write of data from the buffer cache

If they define _POSIX_SYNCHRONIZED_IO, which they don't.

fsync wasn't defined as requiring a flush until version 5 of the spec. It was implemented in the BSDs loooong before then. Apple introduced F_FULLFSYNC before fsync had that new definition.

I don't disagree with you, but it is what it is. History is a thing. Legacy support is a thing. Apple likely didn't want to change people's expectations of the behaviour on OS X - they have their own implementation after all (which is well documented, lots of portable software and libs actively use it, and it's built in to the higher-level APIs that Mac devs consume).

cryptonector 4 years ago

> > transferred to the storage device
>
> MacOS does that.

Depends on the definition of "storage device", I guess. If it's physical media, then OS X doesn't. If it's the controller, then OS X does. But since the intent is to have the data reach persistent storage, it has to be the physical media.

My guess is that since people know all of this, they'll just keep working around it as they already do. Newbies to OS X development will get bitten unless they know what to look for.
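
For anyone who hasn't seen the workaround, it tends to look roughly like the sketch below (Python only for brevity). `durable_sync` and `demo` are made-up names for illustration, not from any library. The assumption is that `fcntl.F_FULLFSYNC` is only exposed where the platform defines it (macOS), so the code probes for it and falls back to plain `fsync` elsewhere, or when the filesystem rejects the request.

```python
import fcntl
import os
import tempfile


def durable_sync(fd):
    """Push fd's data as far toward physical media as the platform allows.

    Returns the name of the mechanism that was used (hypothetical helper).
    """
    # Assumption: the constant exists only on platforms that define it (macOS).
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            # Some filesystems do not support the request; fall back to fsync.
            pass
    os.fsync(fd)
    return "fsync"


def demo():
    """Write a record to a temporary file and sync it."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"commit record\n")
        f.flush()  # move data from the userspace buffer into the kernel first
        return durable_sync(f.fileno())
```

Calling `demo()` reports which path was taken on the machine it runs on. Whether either call really reaches the physical media still depends on the OS, the filesystem, and the drive.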