Comment by josephg
5 years ago
Author of the blog post here. I’ll let you in on a dirty secret: I agree with you.
I see text editing as the beachhead for this tech. Text editing is hard enough that other systems don’t work well, so you’re kind of forced to use OT or CRDTs to make it work. And text documents are lists of characters - so once you’ve made it work there you have an implementation that can also sync arbitrary lists. At that point, editing maps (js objects) and adding support for arbitrary moves and reparenting will allow us to add real-time editing to a lot more software.
I think there are a lot of interesting software architectures that can be enabled by making everything in a system a CRDT document. But text is still the beachhead. And a good litmus test for performance. And an important application in its own right.
On the one hand, the generality of the text editing solutions is really powerful, and I see what you mean about that solution generalizing to other domains. On the other hand, I always think back to how popular Memcache and Redis were early on, when they had very few features. Just having a fast, in-memory cache of strings empowered a lot of interesting new product development. I really wonder how much the average developer on a random project could get out of an appliance that lets you create and mutate an arbitrarily large number of values of the well-known, simple CRDT types like g-counters, pn-counters, 2P-sets, etc.
Most of the literature is focused on "how do we merge the most complex data type", and not on questions like "how do we manage the entire lifecycle of CRDT stores", "how does a CRDT store fit into most people's stack", "should CRDT types be added to existing stores or should developers expect dedicated CRDT-only stores", or "do people generally need to define their own CRDT types or do most people just want a box full of common ones".
I hand-rolled my own CRDT setup just to get the simplest CRDT types, because I didn't see anything out there that makes CRDT types directly accessible to application developers. E.g., you make a g-counter and literally all a client can do with it is increment it or read its value. That's it. We have that, and it's totally useful! We also do entirely state-based replication, because expressing the operations on the data would be so much larger than the data itself.
But our situation doesn't match many people's: our clients are only ever interested in the current state (and never care about past state), and we can safely ignore the problem of when to delete the data. We just keep it around while you're playing the game and delete it when the game is over.
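For anyone who hasn't seen one, here is a minimal sketch of the kind of g-counter described above, with state-based replication. It is not the commenter's actual code: the class name `GCounter`, the `replica_id` field, and the method names are all made up for illustration. The only operations are increment, read, and merge of a full state received from another replica.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter (hypothetical sketch, state-based replication)."""

    replica_id: str
    # One slot per replica; each replica only ever writes to its own slot.
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a g-counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so states can arrive in any order, any number of times.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas increment independently, then exchange full state.
a = GCounter("a")
b = GCounter("b")
a.increment()
a.increment()
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging the same state again changes nothing
```

After the exchange both replicas hold the same per-replica slots and report the same value. A pn-counter can be built the same way from two of these, one for increments and one for decrements.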
Etcd does some of this stuff, but in general it sounds like you have a useful open source tool inside you that wants to be shared with the world. I can't wait to see it made.
How could CRDTs be used to collaborate in a project made with a visual programming language that consists of interconnected operators? Is it necessary to serialize this graph?