
Comment by lumpypua

10 years ago

Functional versus imperative concurrent shared data approaches provide a good analogy:

* Single file + log: fine-grained locking in a shared C data structure. Yuck!

* Write new then move: transactional reference to a shared storage location, something like Clojure's refs. Easy enough.

The latter clearly provides the properties we'd like. The former may too, but it's a lot more complicated to verify and there are tons of corner cases. So I think moving the new file over the old one is the simpler strategy and way easier to reason about.
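To make the "write new then move" strategy concrete, here is a minimal sketch in Python. The helper name `atomic_write` and the `.tmp-` prefix are made up for illustration. It assumes the temp file and the target live on the same filesystem, which is what lets the rename behave atomically on POSIX systems.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` (bytes), all or nothing.

    Hypothetical helper: readers see either the old file or the new one,
    never a half-written mix.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename
    # does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the new bytes to disk before the swap
        os.replace(tmp_path, path)  # the "move new file over old file" step
    except BaseException:
        # On failure, clean up the temp file; the old file is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling `atomic_write("state.json", b"{}")` either leaves the previous `state.json` in place or swaps in the complete new one. Making the rename itself survive a power loss typically also needs an fsync on the containing directory, which this sketch leaves out.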

The obvious downside is that this temporarily uses twice the size of the dataset. However, that is usually mitigated by splitting the data into multiple files, and/or applying this only to applications that don't need to store gigabytes in the first place.

Clojure's approach again provides an interesting solution to saving space. Taking the idea of splitting data into multiple files to its logical conclusion, you end up with the structure sharing used in efficient immutable data structures.
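As a rough illustration of structure sharing, here is a small Python sketch of an immutable binary search tree that copies only the path from the root to the changed node. The `Node` and `insert` names are invented for this example, and the unbalanced tree is a simplification: as far as I know Clojure's own collections use wide hash tries rather than binary trees, but the sharing idea is the same.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, value):
    """Return a new tree containing key -> value; the old tree is unchanged."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        # Copy this node, reuse the untouched right subtree as-is.
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        # Copy this node, reuse the untouched left subtree as-is.
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


old = None
for k in (5, 2, 8, 1):
    old = insert(old, k, str(k))

new = insert(old, 9, "nine")

# The old version is still intact, and the untouched subtree is shared
# between both versions instead of being duplicated.
assert old.right.right is None
assert new.right.right.key == 9
assert new.left is old.left
```

Each "write" allocates only the nodes along one root-to-leaf path, so keeping both the old and the new version costs far less than two full copies of the data.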