Comment by otterley

4 years ago

That particular implementation seems inconsistent with the following requirement:

> The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes.

If I wrote that requirement in a classroom programming assignment and you presented me with that code, you'd get a failing grade. Similarly, if I were a product manager and put that in the spec and you submitted the above code, it wouldn't be merged.

> You are quoting the non-normative informative part

Indeed, I am! It is important. Context matters, both in law and in programming. As a legal analogy, if you study Supreme Court rulings, you will find that in addition to examining the text of legislation or regulatory rules, the court frequently looks to legislative history, including Congressional findings and statements by regulators and legislators in order to figure out how to best interpret the law - especially when the text is ambiguous.

> If I wrote that requirement in a classroom programming assignment and you presented me with that code, you'd get a failing grade.

It's a good thing operating systems aren't made up entirely of classroom programming assignments.

Picture an OS which always runs on fully-synchronized storage (perhaps a custom Linux or BSD or QNX kernel). If there's no write cache and all writes are synchronous, then fsync() doesn't need to do anything at all; therefore `int fsync(int fd) { return 0; }` is valid because fsync()'s method is implementation-specific.

This allows you to have no software or hardware write cache and not implement fsync() and still be POSIX-compliant.

> Context matters, both in law and in programming. As a legal analogy, if you study Supreme Court rulings, you will find that in addition to examining the text of legislation or regulatory rules, the court frequently looks to legislative history, including Congressional findings and statements by regulators and legislators in order to figure out how to best interpret the law - especially when the text is ambiguous.

The POSIX specification is not a court of law, and the context is pretty clear: fsync() should do whatever it needs to do to request that pending writes are written to the storage device. In some valid cases, that could be nothing.

  • > Picture an OS which always runs on fully-synchronized storage (perhaps a custom Linux or BSD or QNX kernel). If there's no write cache and all writes are synchronous, then fsync() doesn't need to do anything at all; therefore `int fsync(int fd) { return 0; }` is valid because fsync()'s method is implementation-specific.

    Sure, I'll give you that, in a corner case where all writes are synchronized to storage before completing. However, most modern computers cache writes for performance, and the speed/security tradeoff is the context of this discussion. We wouldn't be having this debate in the first place if computers and storage devices didn't cache writes.

    > The POSIX specification is not a court of law

    Indeed, it isn't; nor is legislative text (the closest analogy in law). Hence the need for interpretation.

    > fsync() should do whatever it needs to do to request that pending writes are written to the storage device

    We are in violent agreement about this :-)

    • The wording here is quite subtle. Without SIO (that is, when _POSIX_SYNCHRONIZED_IO is not defined), fsync is merely a request that returns an error if one occurred. As the informative section points out, this means that the request may be ignored, and ignoring it is not an error.

      > If _POSIX_SYNCHRONIZED_IO is not defined, the wording relies heavily on the conformance document to tell the user what can be expected from the system. It is explicitly intended that a null implementation is permitted.

      Compare this to e.g. the wording for write(2):

      > The write() function shall attempt to write nbyte bytes from the buffer pointed to by buf to the file associated with the open file descriptor, fildes. [yadadada]

      This actually specifies that an action needs to be performed. fsync(2) sans SIO is merely a request that the OS can respond to or not. And because macOS does not define SIO, you have to go and find out what that particular implementation actually does, and the answer is: essentially nothing for fsync.

    • There’s also the very likely possibility that the storage is lying to the OS: it reports that the data it accepted into its buffer has been written somewhere durable, while it’s actually still waiting for an erase to finish or a head to get wherever it needs to be. There are disk controllers with batteries precisely for those situations.

      And, if cheating will give better numbers on benchmarks, I’m willing to bet money most manufacturers will cheat.
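
For anyone who wants to act on the points in this thread, here is a minimal sketch in C of what a caller might do when a conforming fsync() is allowed to be a null implementation. The helper names `durable_sync` and `has_synchronized_io` are hypothetical and not from any comment above. The sketch assumes a POSIX system; on macOS it uses the `F_FULLFSYNC` command of fcntl(), which asks the drive to flush its own cache, and falls back to plain fsync() elsewhere or if the file system rejects the request. None of this helps if the hardware lies about completion.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: request the strongest flush the platform offers.
 * Returns 0 on success, -1 with errno set on failure. */
int durable_sync(int fd)
{
#if defined(__APPLE__) && defined(F_FULLFSYNC)
    /* macOS: F_FULLFSYNC also asks the drive to flush its write cache. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* Not every file system supports F_FULLFSYNC; fall back to fsync(). */
#endif
    return fsync(fd);
}

/* Hypothetical helper: does the implementation advertise the
 * Synchronized Input and Output option at run time?
 * Returns 1 if advertised, 0 if not (a null fsync() is then permitted). */
int has_synchronized_io(void)
{
#ifdef _SC_SYNCHRONIZED_IO
    return sysconf(_SC_SYNCHRONIZED_IO) > 0;
#else
    return 0;
#endif
}
```

A caller would typically check `has_synchronized_io()` once at startup to decide how much to trust fsync(), and call `durable_sync(fd)` after the writes that must survive a power loss, accepting the performance cost.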