Comment by danielvaughn

14 days ago

CRDTs work well for linear data structures, but there are known issues with hierarchical ones. For instance, if you have a tree, two clients could concurrently send move operations that, once merged, create a cycle, so a node ends up being its own ancestor.
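To make the tree problem concrete, here is a minimal sketch in Python. The parent-pointer dict, the node names, and the helper functions are all hypothetical and not taken from any particular CRDT library. Each move is valid on the replica that made it, but naively applying both produces a cycle.

```python
# Hypothetical model: the tree is a dict mapping each node to its parent.
def is_ancestor(parent_of, ancestor, node):
    """Return True if `ancestor` is reached by walking up from `node`."""
    seen = set()
    current = parent_of.get(node)
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


def has_cycle(parent_of):
    """A tree is broken if any node is its own ancestor."""
    return any(is_ancestor(parent_of, node, node) for node in parent_of)


initial = {"root": None, "a": "root", "b": "root"}

# Client 1 moves "a" under "b"; client 2 concurrently moves "b" under "a".
replica_1 = {**initial, "a": "b"}
replica_2 = {**initial, "b": "a"}

# Naively applying both moves leaves a -> b -> a, detached from the root.
naive_merge = {**initial, "a": "b", "b": "a"}

print(has_cycle(replica_1), has_cycle(replica_2), has_cycle(naive_merge))
```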

That said, there’s work that has been done towards fixing some of those issues.

Evan Wallace (I think he’s the CTO of Figma) has written about a few solutions he tried for Figma’s collaborative features. And then Martin Kleppmann has a paper proposing a solution:

https://martin.kleppmann.com/papers/move-op.pdf

Martin Kleppmann, in one of his recent talks about the future of local-first, mentions the need for a generic sync service for the 'local-first end-game' [0], as he calls it. Standardization is needed. Right now everyone and their mother is doing sync differently and building production platforms around their own protocols and mechanisms.

[0] https://www.youtube.com/watch?v=NMq0vncHJvU&t=1016s

  • The problem is that the requirements can be vastly different. A collaborative editor is very different from, say, syncing encrypted blobs. Perhaps there is a one-size-fits-all solution, but I doubt it.

    I've been working on sync for the latter use case for a while and CRDTs would definitely be overkill.

Automatic conflict resolution will always be limited. For example, who seriously believes that we’ll ever be able to fully automate the handling of merge conflicts in version control? (Even if we recorded every single edit operation on the syntax-tree level.) And in regular documents the situation is worse, because you don’t have formal parsers and type checkers and unit tests for them. Even for schematized structured data, there are similar issues on the semantic level that a mere “it conforms to the schema” doesn’t solve.

  • Indeed. So conflict resolution that takes input from the user needs to be part of the protocol. Just like in Git.

    • Or from an LLM. It can't be super complicated to describe a logic schema to do a merge like that. It's just business logic vs. universal logic.

As long as all clients agree on the order of CRDT operations then cycles are no problem. It's just an invalid transaction that can be dropped. Invalid or contradictory updates can always happen (regardless of sync mechanism) and the resolution is a UX issue. In some cases you might want to inform the user, in other cases the user can choose how to resolve the conflict, in other cases quiet failure is fine.
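A rough sketch of that idea in Python, under stated assumptions: the `Move` record, the logical timestamp, and the client-name tie-breaker are hypothetical choices made only so that the order is total. Every replica sorts the same set of moves the same way and drops any move that would create a cycle.

```python
from typing import NamedTuple


class Move(NamedTuple):
    timestamp: int   # hypothetical logical clock value
    client: str      # tie-breaker, so that the order is total
    node: str
    new_parent: str


def would_cycle(parent_of, node, new_parent):
    """True if `node` is `new_parent` itself or one of its ancestors."""
    current = new_parent
    while current is not None:
        if current == node:
            return True
        current = parent_of.get(current)
    return False


def apply_in_agreed_order(initial, moves):
    """Apply moves sorted by (timestamp, client, ...); drop invalid ones."""
    parent_of = dict(initial)
    dropped = []
    for move in sorted(moves):
        if would_cycle(parent_of, move.node, move.new_parent):
            dropped.append(move)  # invalid transaction: skip it
            continue
        parent_of[move.node] = move.new_parent
    return parent_of, dropped


initial = {"root": None, "a": "root", "b": "root"}
m1 = Move(1, "alice", "a", "b")
m2 = Move(1, "bob", "b", "a")

# Two replicas receive the same moves in different network orders.
tree_1, dropped_1 = apply_in_agreed_order(initial, [m1, m2])
tree_2, dropped_2 = apply_in_agreed_order(initial, [m2, m1])
```

Both replicas end up with the same tree and the same dropped move. What to do with the dropped move (notify the user, offer a choice, or fail quietly) is left to the application.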

  • Unfortunately, a hard constraint of (state-based) CRDTs is that merging causally concurrent changes must be commutative, i.e., clients may not be able to agree on the order of CRDT operations, and they must arrive at the same state after applying them in any order.

    • I don't think that's required, unless you definitionally believe otherwise.

      When clients disagree about the order of events and a conflict results, clients can be required to roll back (apply the inverse of each change) to the last point in time where all clients were in agreement about the world state. Then, all clients re-apply all changes in the new, now-agreed-upon order. Now all changes have been applied, there is agreement about the world state, and the process starts anew.

      This way multiple clients can work offline for extended periods of time and then reconcile with other clients.
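The roll-back-and-replay scheme from the last reply could look roughly like this in Python. This is a sketch under assumptions, not anyone's actual protocol: operations are hypothetical `(timestamp, client, key, value)` tuples that set a key in a flat dict, tuple ordering is the agreed order, and `None` is reserved to mean "key was absent".

```python
import bisect


class Replica:
    def __init__(self):
        self.state = {}  # key -> value (values are assumed to be non-None)
        self.log = []    # (op, previous_value) pairs, sorted by op

    def _do(self, op):
        """Apply an op and return the value it overwrote, for later undo."""
        _, _, key, value = op
        previous = self.state.get(key)
        self.state[key] = value
        return previous

    def _undo(self, op, previous):
        """Apply the inverse of an op using the value it overwrote."""
        key = op[2]
        if previous is None:
            self.state.pop(key, None)
        else:
            self.state[key] = previous

    def receive(self, op):
        # Find where the new op belongs in the agreed order.
        index = bisect.bisect([entry[0] for entry in self.log], op)
        later = self.log[index:]
        del self.log[index:]
        # Roll back every op that is ordered after the new one...
        for later_op, previous in reversed(later):
            self._undo(later_op, previous)
        # ...then apply the new op and replay the rolled-back ops.
        for next_op in [op] + [entry[0] for entry in later]:
            self.log.append((next_op, self._do(next_op)))


a = (1, "alice", "title", "Draft")
b = (2, "bob", "title", "Final")
c = (3, "alice", "owner", "alice")

r1, r2 = Replica(), Replica()
for op in (a, b, c):
    r1.receive(op)
for op in (c, b, a):  # same ops, opposite delivery order
    r2.receive(op)
```

Both replicas converge on the same state and the same log even though they saw the operations in opposite orders. The cost is that a very old operation arriving late forces a long rollback.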
