Comment by taeric
5 years ago
I hate that I'm skeptical of this. I suspect Wave just left that bad of a taste behind. So much hubris in what was claimed to be possible.
The ideas do look nice. And I suspect it has gotten farther than I give it credit for. However, sequencing the edits of independent actors is likely not something you will solve with a data structure.
Take the example of a doc getting overwhelmed. Let's say you can make it so that you don't have a server to coordinate. Is it realistic to think hundreds of people can edit a document in real time at the same time and come up with something coherent?
Best I can currently imagine is that it works if they are editing hundreds of pages. But that is back to the basic wiki structure working fine.
So, help me fix my imagination. Why is this the future?
In the case of a text document, concurrent edits form branches of a tree in many string CRDTs: http://archagon.net/blog/2018/03/24/data-laced-with-history/
So yes, hundreds of people can edit a string and produce a coherent result at the end. Contiguous runs of characters will stick together and interleave with concurrent edits.
CRDTs don't guarantee coherence, but instead guarantee consistency.
The result may often be coherent at the sentence level if the edits are normal human edits, but often will not be at the whole-document level.
For a simplistic example, if one person changes a frequently-used term throughout the document, and another person uses the old term in a bunch of places when writing new content, the document will be semantically inconsistent, even though all users made semantically consistent changes and are now seeing the same eventually-consistent document.
For a contrived example of local inconsistency, consider the phrase "James had a bass on his wall." Alice rewrites this to "James had a bass on his wall, a trophy from his fishing trip last summer," and Brianna separately chooses "James, being musically inclined, had hung his favorite bass on his wall." The CRDT dutifully applies both edits, and resolves this as: "James, being musically inclined, had hung his favorite bass on his wall, a trophy from his fishing trip last summer."
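To make the convergence claim concrete, here is a minimal sketch using Yjs (the same library mentioned further down the thread), with Brianna's rewrite simplified to a single insert. The exact interleaving of concurrent edits is implementation-dependent; the only property being shown is that both replicas end up with identical text.

```typescript
// Minimal sketch of the "bass" scenario with Yjs. The merged wording is
// decided by the CRDT, not by anyone's editorial judgment; the guarantee is
// only that both replicas converge to the same string.
import * as Y from 'yjs'

const alice = new Y.Doc()
const brianna = new Y.Doc()

// Both start from the same shared state.
alice.getText('t').insert(0, 'James had a bass on his wall.')
Y.applyUpdate(brianna, Y.encodeStateAsUpdate(alice))

// Concurrent edits made while the two replicas are disconnected.
alice.getText('t').insert(28, ', a trophy from his fishing trip last summer')
brianna.getText('t').insert(5, ', being musically inclined,')

// Exchange updates in either order; both sides converge.
const fromAlice = Y.encodeStateAsUpdate(alice)
const fromBrianna = Y.encodeStateAsUpdate(brianna)
Y.applyUpdate(alice, fromBrianna)
Y.applyUpdate(brianna, fromAlice)

console.log(alice.getText('t').toString() === brianna.getText('t').toString()) // true
```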
In nearly any system, semantic data is not completely represented by any available data model. Any automatic conflict-resolution model, no matter how smart, can lead to semantically-nonsensical merges.
CRDTs are very very cool. Too often, though, people think that they can substitute for manual review and conflict resolution.
Right. The problem CRDTs solve is the problem of the three-way merge conflict in git: the problem of the "correct" merge being underspecified by the formalism, and so implementation dependent.
If two different git clients each implemented some automated form of merge-conflict resolution; and then each of them tried to resolve the same conflicting merge; then each client might resolve the conflict in a different, implementation-dependent way, resulting in differing commits. (This is already what happens even without automation—the "implementation" being depended upon is the set of manual case-by-case choices made by each human.)
CRDTs are data structures that explicitly specify, in the definition of what a conforming implementation would look like, how "merge conflicts" for the data should be resolved. (Really, they specify their way around the data ever coming into conflict — thus "conflict-free" — but it's easier to talk about them resolving conflicts.)
In the git analogy, you could think of a CRDT as a pair of "data-format aware" algorithms: a merge algorithm, and a pre-commit validation algorithm. The git client would, upon commit, run the pre-commit validation algorithm specific to the file's type, and only actually accept the commit if the modified file remained "mergeable." The client would then, upon merge, hand two of these files to a file-type-specific merge algorithm, which would be guaranteed to succeed assuming both inputs are "mergeable." Which they are, because we only let "mergeable" files into commits.
Such a framework, by itself, doesn't guarantee that anything good or useful will come out the other end of the process. Garbage In, Garbage Out. What it does guarantee is that clients doing the same merge will deterministically generate the same resulting commit. It's up to the designer of each CRDT data structure to specify a useful merge algorithm for it; and it's up to the developer to define their data in terms of a CRDT data structure that has the right semantics.
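For a toy illustration of what "the merge is part of the data type's definition" means, here is a grow-only set, about the simplest state-based CRDT there is. Real document CRDTs are far more involved; this just shows the shape of the guarantee.

```typescript
// Toy grow-only set (G-Set). The merge is defined by the data type itself:
// set union. Union is commutative, associative, and idempotent, so every
// replica that eventually sees the same updates -- in any order, in any
// grouping -- deterministically reaches the same state.
class GSet<T> {
  private items = new Set<T>()

  add(item: T): void {
    this.items.add(item)
  }

  merge(other: GSet<T>): void {
    for (const item of other.items) this.items.add(item)
  }

  values(): T[] {
    return [...this.items].sort() // sorted only to make printing deterministic
  }
}

// Two replicas diverge, then merge in either order; the result is identical.
const a = new GSet<string>()
const b = new GSet<string>()
a.add('x'); a.add('y')
b.add('y'); b.add('z')
a.merge(b)
b.merge(a)
console.log(a.values(), b.values()) // [ 'x', 'y', 'z' ] [ 'x', 'y', 'z' ]
```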
So the CRDTs could be applied to a document and an edit/change log to guarantee the consistency of the log and its entries, not necessarily the document itself?
I upvoted for the link alone. This article (data-laced-with-history) is the best source if you are starting your journey into CRDTs.
What if the document starts empty and syncing doesn't happen until everyone presses submit? Will the CRDTs produce a valid document? Yes. Will it make any sense? Who knows. I think that's what OP is getting at.
I read it as a question regarding OT vs. CRDTs, which I believe would produce similar results even under heavy concurrency. In terms of larger edits or refactors, you’d probably need to do something else, e.g. lock the document or section, unshare the document, use some sort of higher-level CRDT that ships your changes atomically and forces a manual merge on concurrent edits, etc. None of these necessarily require a central server, though they may require an active session between participants.
I should also note that even if you use regular merge, and the end state of a text document is a complete mess after a refactor + concurrent edits, there’s enough data in the tree to simply pull out any concurrent contributions. They could then be reapplied manually if needed. Perhaps the app could even notice this automatically and provide an optional UI for this process. Similarly, it would be possible for the concurrent editors to remove the refactor edits and thus “fork” their document.
> Is it realistic to think hundreds of people can edit a document in real time at the same time and come up with something coherent?
And here's the thing: Can 100 people edit a document, even in theory, and have it make sense? I think the answer is "no," with or without technology.
I'm sure there are other uses for these data structures, but shared editing is always the example I read about.
Ultimately I think the answer is "it depends", but the issue is that there is usually document structure which is not visible in the data structure itself. For example, imagine getting 100 people to fill out a row on a spreadsheet about their preferences for some things or their availability on certain dates. If each person simultaneously tries to fill in the third row of the spreadsheet (after the headings and the author), then a spreadsheet CRDT probably would suck at merging the edits. But if you had a CRDT for the underlying structure of this specific document, you could probably merge the changes (e.g. sort the set of rows alphabetically by name and do something else if multiple documents have rows keyed by the same name).
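A rough sketch of what keying rows by their semantic identity (the person's name) rather than their position might look like. All identifiers here are made up for illustration; a real spreadsheet CRDT would also need per-cell merging.

```typescript
// Rows are identified by name, not by position in the sheet, so 100 people
// all "adding the third row" concurrently just union together.
interface Row {
  name: string        // the semantic key
  availability: string
}

// Merge two replicas' row maps. Rows with different keys coexist; concurrent
// rows with the same key need a deterministic tie-break (here, the
// lexicographically larger value wins, purely so every replica agrees).
function mergeRows(a: Map<string, Row>, b: Map<string, Row>): Map<string, Row> {
  const out = new Map(a)
  for (const [key, row] of b) {
    const existing = out.get(key)
    if (!existing || row.availability > existing.availability) {
      out.set(key, row)
    }
  }
  return out
}

// Display order is derived (e.g. sorted by name), not stored, so there is no
// "row 3" to fight over.
const sorted = (rows: Map<string, Row>) =>
  [...rows.values()].sort((x, y) => x.name.localeCompare(y.name))
```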
It depends how big the document is, i.e. what is the density of users per page. If it's a 100 page document and the 100 users are all working on different sections, then it could easily be possible.
I just don't remotely see a use case for this. Real-time human collaboration in general fails at a scale much smaller than this, and not because of the tools available.
JoeDocs[1] could be a useful related project to track - the Coronavirus Tech Handbook[2], amongst other collaborative documents, is now hosted by their service.
They utilize the same YJS[3] library mentioned in the article this thread discusses, and their GitHub repos include some useful working demonstration application code.
[1] - https://joedocs.com/
[2] - https://coronavirustechhandbook.com/
[3] - https://docs.yjs.dev/
A question I always have: if CRDTs solve some problem with collaborative editing, then can git's merge algorithm be rewritten to use CRDTs and benefit from it somehow?
Somehow I think the answer is no. There is a reason we still have to manually drop down to a diff editor to resolve certain kinds of conflicts after many decades.
I think a better question is “what if merges were more well behaved,” where “well behaved” means they have nice properties like associativity and having the minimal amount of conflict without auto-resolving any cases that should actually be a conflict.
The problem with using a CRDT is the CR part: there are generally merge conflicts in version control for a reason. If your data type isn’t “state of the repo with no conflicts” or “history of the repo and current state with no conflicts” but something like “history of the repo and current state including conflicts from unresolved merges” then maybe that would work but it feels pretty complicated to explain and not very different from regular git. Also note that you need history to correctly merge (if you do a 3-way merge of a history of a file of “add line foo; delete line foo” with a history of “add line foo; delete line foo; add line foo” and common ancestor “add line foo”, you should end with a history equal to the second one I described. But if you only look at the files you will probably end up deleting foo)
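To spell out that parenthetical with a toy, snapshot-only merge (everything here is illustrative, not git internals): branch B's file looks identical to the ancestor, so the merge takes branch A's deletion, even though B's history deliberately re-added the line.

```typescript
// Files represented as arrays of lines.
const ancestor: string[] = ['foo']  // common ancestor: "add line foo"
const branchA: string[]  = []       // history A: add foo; delete foo
const branchB: string[]  = ['foo']  // history B: add foo; delete foo; add foo again

// A naive snapshot-only 3-way merge: if one side is unchanged from the
// ancestor, take the other side wholesale.
function threeWayMerge(base: string[], a: string[], b: string[]): string[] {
  const same = (x: string[], y: string[]) => x.join('\n') === y.join('\n')
  if (same(a, base)) return b
  if (same(b, base)) return a
  return a // a real tool would diff hunk-by-hunk and possibly report a conflict
}

// B's snapshot equals the ancestor, so the merge takes A and foo is gone --
// even though history B extends history A and ends with an intentional
// re-add, so a history-aware merge would keep foo.
console.log(threeWayMerge(ancestor, branchA, branchB)) // []
```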
See also: darcs and pijul.
The answer is no, but unlike git, CRDTs make a choice for you, and all nodes get convergent consistency. The problem heretofore with CRDTs is that those choices have not been sane. I think there is a recent crop of CRDTs that are "95% sane", and honestly that's probably good enough. There is an argument that optimal human choices will never be reconcilable with commutativity, which I totally buy, but I think there is also an argument for "let not the perfect be the enemy of the awesome". And having made a choice, even if it's not optimal, is much firmer ground to build upon than blocking on leaving a merge conflict undecided.
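To illustrate "makes a choice for you": the classic last-writer-wins register simply picks whichever write carries the higher timestamp, with a replica-id tie-break. That may not be the choice a human would make, but every replica makes the same one. The names and values below are purely illustrative.

```typescript
// Toy last-writer-wins (LWW) register: concurrent writes are resolved by an
// arbitrary but deterministic rule (highest timestamp, then highest replica
// id), so all replicas converge -- whether or not that is the "right" answer.
interface LWW<T> {
  value: T
  timestamp: number
  replicaId: string
}

function mergeLWW<T>(a: LWW<T>, b: LWW<T>): LWW<T> {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b
  return a.replicaId > b.replicaId ? a : b // deterministic tie-break
}

// Two concurrent writes to a document title; every replica that merges them
// agrees on the same winner, even if the team preferred the other one.
const fromAlice = { value: 'Quarterly Report (final)',    timestamp: 1712, replicaId: 'alice' }
const fromBob   = { value: 'Quarterly Report (final v2)', timestamp: 1712, replicaId: 'bob' }
console.log(mergeLWW(fromAlice, fromBob).value) // "Quarterly Report (final v2)"
```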
Git mostly treats merging as a line-oriented diff problem. Even though you can specify language-aware diffing in theory, it doesn't seem to buy you much in practice (based on my experience with the C# language-aware diff).
It wouldn't make much sense to me to just plug a text CRDT in place of a standard text diff. CRDTs like automerge are capable of representing more complex tree structures however and if you squint you can sort of imagine a world where merging source code edits was done at something more like the AST level rather than as lines of text.
I've had some ugly merge conflicts that were a mix of actual code changes and formatting changes which git diffs tend not to be much help with. A system that really understood the semantic structure of the code should in theory be able to handle those a lot better.
IDEs have powerful refactoring support these days like renaming class members but source control is ignorant of those things. One can imagine a more integrated system that could understand a rename as a distinct operation and have no trouble merging a rename with an actual code change that touched some code that referenced the renamed thing in many situations. Manual review would probably still be necessary but the automated merge could get it right a much higher percentage of the time.
For specific use cases where the data format is not plain text and/or is formatted as JSON (e.g. Jupyter notebooks, rich text editing like ProseMirror), I can see CRDTs being used to automatically merge files and retain consistency. Two things though:
1. This doesn't require modifying git's merge algorithm itself; just having custom merge drivers built on top (see the sketch after this list).
2. Is using more space (edit history of CRDTs vs. a regular JSON document) worth it for automatic merging?
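On point 1: git already lets you route particular file types through an external merge program via a merge driver. The driver command below ("crdt-json-merge") is hypothetical, but the %O/%A/%B placeholders (ancestor, ours, theirs, with %A doubling as the output file) are standard git behavior.

```
# .gitattributes -- send notebook files to a custom merge driver
*.ipynb merge=crdtjson

# one-time setup; "crdt-json-merge" is a hypothetical CRDT-aware tool.
git config merge.crdtjson.driver "crdt-json-merge %O %A %B --output %A"
```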
Depends on what kind of document we're talking about, i.e. how the grammar captures the domain model. E.g. a shared ledger in the case of digital currencies, or the Linux source code being worked on remotely by many people, are exactly examples of such documents.
Maybe if your "document" is the Encyclopedia Britannica? Wikipedia has hundreds of editors working at once, but that only really works because it's broken up into millions of smaller parts that don't interact much.
I meant this to be my takeaway. The data structure is nice. And I suspect it is a perfect fit for some use cases. I question the use case of shared editing. Not just the solution, but the use case.
Your key insight, which is spot-on, is that nothing can prevent human-level editing conflicts.
If I were going to take a stab at justifying the importance of CRDTs, I would say:
CRDTs are the future because they solve digital document-level conflict.
They don't bypass the problem the way that diff/patch/git conflict resolution does, by requiring human intervention.
Instead they truly and utterly obliterate the digital conflict resolution problem: a group of people editing a document can separately lose network connectivity, use different network transports, reconvene as a subgroup of the original editors... and their collective edits will always be resolved automatically by software into a deterministic document that fits within the original schema.
If viable, this has far-reaching implications, particularly related to cloud-based document and sharing systems.
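A minimal sketch of the "lose connectivity, reconvene as a subgroup" claim, again assuming Yjs: updates can be exchanged over any transport and applied in any order, and every replica that eventually sees them all ends up identical.

```typescript
// Partition-and-reconvene with Yjs: three peers edit while apart, then gossip
// their updates to each other in arbitrary order. No server, no leader --
// applying the same set of updates in any order yields the same document.
import * as Y from 'yjs'

const peers = [new Y.Doc(), new Y.Doc(), new Y.Doc()]
peers[0].getText('t').insert(0, 'alpha ')
peers[1].getText('t').insert(0, 'bravo ')
peers[2].getText('t').insert(0, 'charlie ')

// Each peer eventually receives every peer's state, in whatever order.
const updates = peers.map(p => Y.encodeStateAsUpdate(p))
for (const peer of peers) {
  for (const update of updates.slice().reverse()) {
    Y.applyUpdate(peer, update)
  }
}

// All three replicas converge to the same (implementation-ordered) text.
console.log(new Set(peers.map(p => p.getText('t').toString())).size) // 1
```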
But how do they obliterate it? They just move the authority, no?
That is, say you get a hundred machines editing a document. They split into partitions for a time and eventually reunite into a single one. What sort of coherent and usable data will they make? Without basically electing a leader to reject branches of the edits, sending them back to the machines whose edits were rejected?
There's no leader node necessarily required; each participant application in the session may have its own local copy of the document, and they apply edits to that using CRDT operations.
It's no doubt possible to construct applications that don't behave correctly for certain combinations of edits -- but the data structures themselves should be robust under any re-combination of the peer group's operations.
Edit / addendum: to phrase this another way and perhaps answer you more clearly: it's a responsibility of the application designer to come up with a document format for their application (and corresponding in-app edit operations) that will tend to result in 'sensible' recombinations under collaborative editing.
My sense so far is that this is the tradeoff; the complexity moves into the document format and edit operations. But that's a (largely) one-off up-front cost, and the infrastructure savings and offline/limited-connectivity collaboration support it affords continue to accrue over the lifetime of the software.
> sequencing the edits of independent actors is likely not something you will solve with a data structure.
Any multiplayer game does this. Git does this as well.
So of course you can do this, it's a matter of how you reconcile conflicts. Real-time interactive games will generally choose a FIFO ordering based on what came into the server's NIC first. Git makes the person pushing the merge reconcile first.
For docs, live editing seems to work the same as in games. Reconciliation for the decentralized workflow will be interesting, but it's just going to be minimizing the hit to a user when their version loses the argument.
But git doesn't do this. It punts to the user pretty quickly. (Even when it performs a merge, it is expected that the user confirms the merge with a build. That is, git just claims the states combined without stepping on the same lines of the same files. The merge has to be verified by a user, from its perspective.)
Games, similarly, typically have a master state server. Yes, they optimistically show some state locally, but the entire process checkpoints with a central state constantly. (Otherwise you get jerky behavior pretty quickly as the states diverge more and more in ways that can't be reconciled without committing to one branch over another.)
Edit: that is, I would think the point is to force more, smaller arguments. Anything else puts more at risk when you lose one. Right?
You're agreeing with me on both points wrt git/games...?
The best way to prevent damage from arguments is to avoid them. Like how Docs shows cursors so you know to avoid areas that are actively being edited. Combined with the tiny incremental changes (often 1 char), users assume that any conflicts are due to each other instead of any distributed inconsistencies.
"Twitch plays Google Docs" is always going to be incoherent, for social reasons. CRDTs can make it possible, they can't make it a good idea.
But for a contrived example, a game with hundreds of players, backed by an enormous JSON document, where the game engine is in charge of making sure each move makes sense: A CRDT could enable that, and each player could save a snapshot of the game state as a simple text file, or save the entire history as the whole CRDT.
Or as a less contrived example, instead of a game, it's a chat client, and it provides rich text a la Matrix, but there's no server, it's all resolved with CRDTs and all data is kept client-local for each client.
There are a lot of cool things you can build with a performant CRDT.
> Why is this the future?
Here is an interview with someone using CRDTs to build an edge state product that answers this question at a high level.
https://www.infoq.com/articles/state-edge-peter-bourgon