Comment by fexecve

3 years ago

How does it do this? Does it cache changes and only write them every so often? Does it keep a file open to write uncommitted changes?

The answer is complex, but boils down to "bad testing practices", or "measuring the wrong thing".

To expand on that a bit: we don't know how the author created, configured and mounted the filesystem they were using; neither the type nor the version of the filesystem is given. The author of the test also doesn't seem to know that any I/O test worth its salt needs to "warm up" the system before its results are reliable. Instead, they run the test multiple times and average the results.
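To make the warm-up point concrete, here is a minimal sketch of a timing helper that discards a few untimed runs before measuring. The name `bench` and its `warmup` / `runs` parameters are made up for illustration; this is not how the original test was written, and reporting the median and spread is just one reasonable choice.

```python
import statistics
import time


def bench(fn, warmup=3, runs=10):
    """Call fn() `warmup` times untimed, then time `runs` further calls."""
    # Untimed calls: let caches, allocators and lazy initialisation settle.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # Report the spread as well as a central value, so that a noisy
    # measurement is visible instead of being hidden inside an average.
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        "samples": samples,
    }
```

Calling it as `bench(lambda: open("some_file", "rb").read())` (with a file of your own) returns the timings in seconds. Warming up only tells you about the warm-cache case; a cold-cache number has to be measured separately and on purpose.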

To give examples of things that may influence the speed of such tests:

* Is the filesystem mounted with the atime option? If it is, it will generate more I/O, because accessing a file now triggers metadata writes (to update the access time) on top of the reads. This is just one example of mount options affecting the test.

* How is the filesystem configured to store directory info? Sometimes it's possible to embed this info in the inode; other times it's a linked-list-like structure that will potentially generate more I/O requests. This is, again, but a single example of a category of factors that have to be controlled for.

* Does the filesystem support journaling, snapshots, deduplication, compression? Is it parallel? How big is it, and how fragmented? How much memory is available for caching? How is the system's I/O request merging configured? And the list of questions not covered by the authors of the test goes on.
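At a minimum, a test report could record the filesystem type and mount options it ran on. Here is a rough sketch that pulls them out of text in the Linux `/proc/mounts` format (device, mount point, type, comma-separated options). The function name `mount_options` is made up for illustration, and the sketch assumes mount points without escaped whitespace.

```python
def mount_options(mounts_text, mount_point):
    """Return (fstype, [options]) for mount_point, or None if not listed."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mount_point:
            # Keep going: a later entry for the same mount point
            # shadows an earlier one.
            found = (fields[2], fields[3].split(","))
    return found


# Made-up sample in /proc/mounts format, used here instead of the real file.
sample = (
    "/dev/sda1 / ext4 rw,noatime,errors=remount-ro 0 0\n"
    "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0\n"
)
print(mount_options(sample, "/"))
```

On a Linux box you would pass in the contents of `/proc/mounts` instead of the sample string. This only covers mount options; how the filesystem was created and tuned has to be recorded separately.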

The authors' own explanation for the results they see is this: the way they designed the filesystem tests, they call open() and close() a lot, and they don't do that in their database tests. But open() and close() don't have a fixed "price". Their cost depends on many of the factors listed above and on the options given to open().
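One way to see that open() has no fixed price is to time open/close pairs on the same file with different flags. The sketch below does that with a temporary file; `time_open_close` and `compare_flags` are made-up helper names, and the three flag sets are an arbitrary choice for illustration. The numbers will differ between filesystems and mount options, which is exactly the point.

```python
import os
import tempfile
import time


def time_open_close(path, flags, n=1000):
    """Average seconds per os.open()/os.close() pair for the given flags."""
    start = time.perf_counter()
    for _ in range(n):
        fd = os.open(path, flags)
        os.close(fd)
    return (time.perf_counter() - start) / n


def compare_flags(n=1000):
    """Time a few flag combinations against one temporary file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * 4096)
        os.close(fd)
        candidates = {
            "O_RDONLY": os.O_RDONLY,
            "O_WRONLY": os.O_WRONLY,
            # Truncation makes open() modify the file, not just look it up.
            "O_WRONLY|O_TRUNC": os.O_WRONLY | os.O_TRUNC,
        }
        return {name: time_open_close(path, flags, n)
                for name, flags in candidates.items()}
    finally:
        os.unlink(path)
```

Running `compare_flags()` on two differently configured filesystems and comparing the dictionaries gives a crude idea of how much the "price" moves around.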

From what I can tell, the I/O is performed in a blocking way, in a single thread, which is about the worst way to perform I/O if you want good performance. So, in this test, both SQLite and the filesystem suck, and if you wanted to make them go faster, you definitely could. The filesystem case in particular has room for improvement.
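As a rough illustration of what "not blocking in a single thread" could look like for the filesystem case, here is a sketch that reads a batch of files one by one and via a thread pool. The function names and the default of 8 workers are assumptions for the example, not anything from the original test, and whether the threaded version actually wins depends on the storage device and on what is already cached.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file fully and return its size in bytes."""
    with open(path, "rb") as f:
        return len(f.read())


def read_sequential(paths):
    """Blocking reads, one after another, in the calling thread."""
    return sum(read_file(p) for p in paths)


def read_threaded(paths, workers=8):
    """Same reads, but several of them in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_file, paths))
```

Both functions return the total number of bytes read, so they can be dropped into the same timing harness and compared on equal terms.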