Comment by derefr
5 years ago
Right. The problem CRDTs solve is the problem of the three-way merge conflict in git: the problem of the "correct" merge being underspecified by the formalism, and so implementation dependent.
If two different git clients each implemented some automated form of merge-conflict resolution, and then each of them tried to resolve the same conflicting merge, each client might resolve the conflict in a different, implementation-dependent way, resulting in differing commits. (This is already what happens even without automation—the "implementation" being depended upon is the set of manual case-by-case choices made by each human.)
CRDTs are data structures that explicitly specify, in the definition of what a conforming implementation would look like, how "merge conflicts" for the data should be resolved. (Really, they specify their way around the data ever coming into conflict — thus "conflict-free" — but it's easier to talk about them resolving conflicts.)
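To make that concrete, here's a minimal sketch of one of the simplest CRDTs, a grow-only counter (the class and names here are illustrative, not from any particular library). Concurrent increments never "conflict" because merging is just an element-wise maximum over per-replica counts, which is commutative, associative, and idempotent:

```python
# Minimal G-Counter (grow-only counter) CRDT sketch.
# Each replica only ever increments its own slot; merge takes the
# element-wise maximum, so no information is lost and no conflict
# can arise -- any two replicas that exchange state converge.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> increments seen from that replica

    def increment(self):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + 1

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max over both replicas' views.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

a = GCounter("a")
b = GCounter("b")
a.increment(); a.increment()   # two increments at replica a
b.increment()                  # one increment at replica b
a.merge(b)
b.merge(a)                     # merge in either order
assert a.value() == b.value() == 3
```

The "specify their way around conflict" part is visible in the state design: because each replica writes only its own slot, two replicas can never disagree about the same entry.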
In the git analogy, you could think of a CRDT as a pair of "data-format aware" algorithms: a merge algorithm, and a pre-commit validation algorithm. The git client would, upon commit, run the pre-commit validation algorithm specific to the file's type, and only actually accept the commit if the modified file remained "mergeable." The client would then, upon merge, hand two of these files to a file-type-specific merge algorithm, which would be guaranteed to succeed assuming both inputs are "mergeable." Which they are, because we only let "mergeable" files into commits.
Such a framework, by itself, doesn't guarantee that anything good or useful will come out the other end of the process. Garbage In, Garbage Out. What it does guarantee is that clients doing the same merge will deterministically generate the same resulting commit. It's up to the designer of each CRDT data structure to specify a useful merge algorithm for it, and it's up to the developer to define their data in terms of a CRDT data structure that has the right semantics.
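A last-writer-wins register shows both halves of that point in a few lines (again a sketch, not any particular library's API): the merge rule is fully specified, so every client computes the same answer, but whether "last writer wins, ties broken by replica id" is actually the semantics you want is entirely the designer's problem:

```python
# Last-writer-wins (LWW) register sketch. State is a tuple of
# (timestamp, replica_id, value). Merge is total and deterministic:
# higher timestamp wins, and ties are broken by replica id, so two
# clients merging the same pair of states always agree. Nothing here
# guarantees the surviving value is the one the user wanted.

def lww_merge(a, b):
    # Python tuple comparison orders by timestamp, then replica_id.
    return max(a, b)

x = (5, "client-a", "hello")
y = (5, "client-b", "world")
# Same inputs, same result, regardless of merge order or which
# client performs the merge:
assert lww_merge(x, y) == lww_merge(y, x) == (5, "client-b", "world")
```

The tie-break on replica id is exactly the kind of arbitrary-but-specified choice the git formalism leaves undefined: it's arguably "garbage" semantics for real text, but it's deterministic garbage, which is what makes convergence possible.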
That just sparked a thought.
For a codebase, unit tests could be the pre-commit validation algorithm. Then, as authors continue to edit the code, they both add unit tests and merge the code. In the face of a merge, the tests could be the deciding factor in what emerges.
Of course, unless you have conflicts in the tests themselves.