Comment by yccs27
14 days ago
For me the main issue with CRDTs is that they have a fixed merge algorithm baked in - if you want to change how conflicts get resolved, you have to change the whole data structure.
I feel like the state of the art here is slowly starting to change. For too many years, I think CRDT work treated "conflict-free" as manifest destiny rather than as hope and a prayer, assuming the right fixed merge algorithm would eventually be found for every situation. I came to watching CRDTs from the perspective of source control, with a strong inkling that "data is always messy" and "conflicts are human" (conflicts are more or less inevitable in any structure trying to encode data made by people).
I've been thinking for a while that it is probably about time the industry renamed that first C to something other than "conflict-free". There is no freedom from conflicts. There is conflict resistance, sure, and CRDTs can provide a lot of conflict resistance in their various data structures. But at the end of the day, if the data structure is meant to encode an application for humans, it needs every merge tool, review tool, and audit tool it can get to deal with those conflicts.
I think we're finally starting to see light at the end of the tunnel in the major CRDT efforts, and we're finally leaving the detour of "no, it must be conflict-free, we named it that so it must be true". I don't think any one library is delivering it at a good high level yet, but I have a feeling that one of the next libraries is going to start getting the ergonomics of conflict handling right.
This seems right to me -- imagine being able to tag objects or sub-objects with conflict-resolution semantics in a more supported way (like LWW, edits from a human, edits from automation, or human resolution required, with or without optimistic application of defaults).
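A minimal sketch of what that tagging could look like, in TypeScript, assuming a hypothetical document layer where each field carries its own merge policy (none of these types or functions come from an existing CRDT library):

```typescript
// Hypothetical per-field merge policies; not the API of any existing CRDT library.
type MergePolicy<T> =
  | { kind: "lww" }                                  // last-writer-wins on timestamp
  | { kind: "preferHuman" }                          // human edits beat automated edits
  | { kind: "manual" }                               // park both sides for human review
  | { kind: "custom"; resolve: (a: T, b: T) => T };  // app-supplied resolver

interface Versioned<T> {
  value: T;
  timestamp: number;              // logical or wall-clock time of the edit
  origin: "human" | "automation"; // who produced the edit
}

interface Conflict<T> {
  local: Versioned<T>;
  remote: Versioned<T>;
}

// Merge one field according to its tagged policy; "manual" returns the conflict
// itself instead of a value so the application can surface it for review.
function mergeField<T>(
  policy: MergePolicy<T>,
  local: Versioned<T>,
  remote: Versioned<T>
): Versioned<T> | Conflict<T> {
  switch (policy.kind) {
    case "lww":
      return remote.timestamp > local.timestamp ? remote : local;
    case "preferHuman":
      if (local.origin !== remote.origin) {
        return local.origin === "human" ? local : remote;
      }
      return remote.timestamp > local.timestamp ? remote : local;
    case "custom":
      return { ...local, value: policy.resolve(local.value, remote.value) };
    case "manual":
      return { local, remote }; // human resolution required
  }
}
```

The point of the sketch is that the policy travels with the schema rather than being baked into the data structure, so changing how a field resolves doesn't mean swapping out the whole CRDT.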
Throwing small language models into the mix could make merging less painful too — like having the system take its best guess at what you meant, apply it, and flag it for later review.
I just want some structure that is conflict-free most of the time, but where I can supply custom logic to be used in certain situations, sort of like an automated git merge conflict resolution function.
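One way to picture that shape: the default path merges automatically, and an optional driver styled after git's merge drivers (base/ours/theirs) is invoked only when the two sides genuinely diverge. This is a hypothetical sketch, not any library's API:

```typescript
// Hypothetical: automatic merge by default, with an escape hatch shaped like
// git's merge drivers (base / ours / theirs) for the cases that really conflict.
type MergeDriver<T> = (base: T | undefined, ours: T, theirs: T) => T;

interface MergeOptions<T> {
  // Called only when the automatic path can't reconcile the two sides.
  onConflict?: MergeDriver<T>;
}

function mergeValue<T>(
  base: T | undefined,
  ours: T,
  theirs: T,
  opts: MergeOptions<T> = {}
): T {
  // The easy cases the "conflict-free" path handles on its own.
  if (Object.is(ours, theirs)) return ours;
  if (Object.is(base, ours)) return theirs;  // only they changed it
  if (Object.is(base, theirs)) return ours;  // only we changed it

  // A genuine conflict: hand it to the app-supplied logic, or fall back to a default.
  if (opts.onConflict) return opts.onConflict(base, ours, theirs);
  return ours; // placeholder default; a real system would use LWW or similar
}

// Example: a counter field reconciles by summing both sides' deltas from the base.
const merged = mergeValue(10, 13, 12, {
  onConflict: (base = 0, ours, theirs) => ours + theirs - base, // 13 + 12 - 10 = 15
});
```

The `onConflict` example is exactly the kind of situation-specific rule the comment is asking to plug in, without giving up the automatic merge everywhere else.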
I've been running into this with automated regex edits. Our product (Relay [0]) makes Obsidian real-time collaborative using yjs, but I've been fighting with the automated process that rewrites markdown links within notes.
The issue happens when a file is renamed by one client and all of the other clients pick up the rename and apply the change to their local files on disk. Since every edit is broken down into delete/keep/insert runs, the automated link rewrite runs concurrently in every client, and the overlapping edits can break the links.
I could limit the edits to just one client, but that feels clunky. Another thought I've had is to use ytext annotations, or to also store a ymap of the link metadata and only apply updates if they pass some kind of check (kind of like schema validation for objects); a rough sketch of that idea is below.
If anyone has a good mental model for automated operations (especially find/replace) in ytext, please let me know! (email in bio)
[0] https://system3.md/relay
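Not a full answer, but here is a rough sketch in TypeScript of the ymap idea above. The yjs calls themselves (getText, getMap, transact, delete/insert) are real API; the map name, the markdownLinkRanges helper, the I_AM_REWRITER flag, and the idea of electing a single rewriting client are assumptions for illustration, not how Relay actually works:

```typescript
import * as Y from "yjs";

const doc = new Y.Doc();
const body = doc.getText("body");             // the note's markdown body
const renames = doc.getMap("link-renames");   // oldPath -> newPath, written on rename

// Assumption: one client is elected as the rewriter, so the delete/keep/insert
// runs for a link rewrite are produced once instead of once per client.
const I_AM_REWRITER = true;

// Hypothetical helper: find [label](path) ranges in the current text.
function markdownLinkRanges(text: string): { index: number; length: number; path: string }[] {
  const out: { index: number; length: number; path: string }[] = [];
  const re = /\[[^\]]*\]\(([^)]+)\)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push({ index: m.index, length: m[0].length, path: m[1] });
  }
  return out;
}

function applyPendingRenames() {
  if (!I_AM_REWRITER) return;
  doc.transact(() => {
    const text = body.toString();
    // Walk the ranges back to front so earlier indices stay valid after each edit.
    for (const range of markdownLinkRanges(text).reverse()) {
      const newPath = renames.get(range.path) as string | undefined;
      if (newPath === undefined) continue; // check: no pending rename for this link
      const oldLink = text.slice(range.index, range.index + range.length);
      body.delete(range.index, range.length);
      body.insert(range.index, oldLink.replace(range.path, newPath));
    }
  }, "link-rewriter"); // origin tag so observers can ignore the rewriter's own edits
}
```

Tagging the transaction with an origin also gives observers a way to skip the rewriter's own edits, which helps avoid the loop where every client re-runs the regex on changes that another client's rewrite just produced.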