CRDTs are the future

5 years ago (josephg.com)

I'm part of the team that makes Zoho Writer (a Google Docs alternative) - https://news.ycombinator.com/item?id=24186883

However, in the spirit of "There are no solutions, only trade-offs": CRDTs are absolutely necessary for certain types of syncing - like syncing a set of database nodes.

But for systems that already mandate a central server (SaaS/cloud), and especially for a complex problem like rich-text editing (i.e. semantic trees), I still think OT provides better trade-offs than CRDTs.

I respect Joseph's conviction on CRDTs being the future, so I guess we'll figure this out sometime soon.

  • My small startup went with Zoho Office at first because of the price, but the features are what have us looking to stay for a while.

    One thing I would love to see is the addition of wildcard addresses like Google and Microsoft have (user+site_string@domain.com).

    Thanks for your hard work on a great product!

    • The Zoho ecosystem is this weird place where you can find almost everything, virtually for free. If you’ve never looked before, check it out - it’s expansive.

      Frustratingly though, there are so many features heaped in that there is no cohesion. Things are frequently buggy, unreliable and disjointed. I’d almost be able to forgive it but unfortunately the support is really terrible too.

      I assessed a lot of CRM software, and with each one I kept finding things it didn't have that Zoho had, but for the reasons above we ultimately chose something else. Which is a shame, because I would pay them a lot more than they ask for them to just be a little better.

  • We went with OT five years ago in CKEditor 5 and we have the same experience. While it would be tempting to try CRDTs, they seem to come with trade-offs too significant for our scenario.

    My colleague wrote down our thoughts in https://news.ycombinator.com/item?id=24621113. With OT, if we get the corner cases right, we can tune and extend the implementation. With CRDT that might be game over.

  • Interesting. I might be adding real-time edit syncing to a hobby project sometime soon. Can you share more about the trade-offs?

    • I haven't yet completely watched Martin's talk on CRDTs, so I might come back and stand corrected. For now, these are some well-known trade-offs:

      A central server: Most OT algorithms depend on a central system for intention preservation. CRDTs are truly distributed and need no central server at all.

      Memory: Traditionally CRDTs consume more memory because deletions are preserved. OT lets you garbage collect some operations since a central system is already recording those ops and sequencing them as well.

      Analysing and cancelling ops: OT lets you easily analyse incoming ops and modify/dummy-ify/cancel them without breaking consistency. This convenience is not necessary for most cases, but it is really important for rich-text editing. For example, when someone merges a couple of table cells while another user is deleting a column, we need to analyze these operations and modify them so as not to end up with an invalid table structure.
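
      To make that concrete, here's a minimal sketch of the transform function at the heart of OT for plain text (hypothetical types; the overlapping-span cases that make OT genuinely hard are elided):

      ```ts
      // OT in miniature: rewrite `op` so it still applies after `applied` ran.
      type Op =
        | { kind: 'insert'; index: number; text: string }
        | { kind: 'delete'; index: number; length: number };

      function transform(op: Op, applied: Op): Op {
        if (applied.kind === 'insert' && op.index >= applied.index) {
          // Everything at or past the insertion point shifted right.
          return { ...op, index: op.index + applied.text.length };
        }
        if (applied.kind === 'delete' && op.index >= applied.index + applied.length) {
          // Everything past the deleted span shifted left.
          return { ...op, index: op.index - applied.length };
        }
        return op; // Overlapping spans need real conflict handling, elided here.
      }
      ```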

    • Dealing with text is still an active area of research for CRDTs. While the problem has been theoretically solved, the solutions require much more memory/bandwidth than OT does.[1] Conversely, CRDTs are significantly better at replicating graphs.

      yjs[2] is one CRDT that handles text reasonably well, but it can still run into performance edge cases (as they plainly/honestly admit in their README).

      [1]: https://github.com/automerge/automerge/issues/89
      [2]: https://github.com/yjs/yjs

    • The transform operation is simpler if you know the order of things. For example, in OT: (op 2) delete "H" at index 0; (op 1) insert "Hello" at index 0. You know that op 1 should come before op 2 because of a central counter. But with a CRDT it's: (a) delete the character with id 0, (b) insert "Hello" at the character with id 0.
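
      In (hypothetical) type terms, the contrast looks roughly like this:

      ```ts
      // OT: positions are plain indices, so a central counter must order ops,
      // and concurrent ops get transformed against each other.
      type OtOp = { kind: 'insert'; index: number; text: string };

      // CRDT (RGA-style): every character carries a stable unique id, and an
      // insert names the id it comes after, so no central ordering is needed.
      type CharId = { peer: string; seq: number };
      type CrdtOp =
        | { kind: 'insert'; after: CharId | null; id: CharId; text: string }
        | { kind: 'delete'; id: CharId }; // tombstoned, never re-indexed
      ```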

  • What does OT stand for?

    In the link, OT is aliased to "Operational Transformations"

    • "Operational Transformation" = "a system that supports collaboration functionalities by separating the high-level transformation (or integration) control from the low-level transformation functions"

      Source: OT's Wikipedia article

      But I felt the same. I'd never heard of "Operational Transformation" before, and both OT and its alias were equally opaque to me.

  • Don't click on a link if you're unsure - from the title or URL - that the content is relevant to you.

    It's equally "disrespectful" to waste readers' time on 101 content if that's not what the post is about.

I'm going to come clean, I am a shitty software engineer.

I'd like to have things like CRDTs under my radar to pull from when architecting technical solutions. But I don't.

I have been coddled by the bullshit of "just-good-enough" web development for nearly a decade, and I feel like I will be haunted by it forever at this point.

I WANT to employ mathematically proven solutions, but not only has it been 10+ years since I studied computer science (and with it, the relevant mathematics), I never even graduated.

So here I am, another under-educated over-paid GED holding Rails engineer putting together shit solutions I know will eventually leak.

So, having said all of that, is there hope for adding things like this to my toolbelt? And where do I even start? I feel like going back to school full-time is out of the question, because I have a life now. Maybe online classes? I'm curious to hear thoughts from people who maybe came from a similar pit of despair, or otherwise understand what I am talking about.

It's just... sad to know enough to see that I produce less-than-optimal work, but not enough to confidently prevent it from happening over and over and over again. Is this just the profession I have locked myself into?

  • For every million bricklayers, there may be one or two material scientists devising new clay formulations to make stronger, cheaper bricks.

    The irony is that the bricklayers are often paid better, and they can be certain that their profession will remain in demand for the foreseeable future.

    The material scientist may hit a dead end, or their company may axe their entire department to "cut costs".

    One of the smartest human beings I've ever met ended up at Google, working on something very similar to CRDTs for the Google Wave project. The outcome was a commercial failure and his team was disbanded.

    I took the same classes as him at university, and I use almost none of that knowledge in my rather pedestrian job. I make nearly twice what he does and I have nearly perfect job security. So there's that.

    • This is exactly why I've focused on boring, mainstream tech skills. I'd rather be doing audio DSP or machine learning or fancy AR apps, but I figure vanilla JS/web chops are going to be in demand for the foreseeable future and pay well enough to support a decent lifestyle.

    • Google Wave used OT, though, not CRDTs. Are you saying that Wave also tried to use CRDTs, but failed?

      Wave was disbanded, but the technology was apparently integrated into other things such as Google Docs, which uses OT for collaboration quite successfully (despite the criticisms in OP's article).

  • Author here. I love this comment.

    > So, having said all of that, is there hope for adding things like this to my toolbelt?

    Yes; but not yet. Right now, if you want, you can use yjs or sharedb, but the APIs aren't spectacular. Eventually I want these things integrated into svelte / react / swiftui and postgres / sqlite / whatever, so it all just sort of works together and we get nice tutorials taking you through what you need to know.

    We aren't there yet. Automerge is slow (but this is being worked on). Yjs is hard to use well with databases, and it's missing some features. We'll get there. The point of all the mathematical formalisms is that if we do it right, it should just work and you shouldn't have to understand the internals to use it.

    • What are you missing from Yjs?

      I share your vision that things should just work. It literally takes you two lines of code to make any of the supported editors collaborative. The Yjs types are a minimal abstraction over the CRDT model. I hope that other people build more abstractions around Yjs (for example, a React store that simply syncs).

      The thing is: all of what you are talking about is already possible. A WASM port will eventually bring some performance improvements, but Yjs is already very close to optimal performance and it works in all browsers.

      But I really am interested in more feature requests. I'm mainly interested in building collaborative apps with CRDTs. CRDTs as a data model are an entirely new concept, and we need to gain experience in how to build apps with them. We also need to find out what is still missing.

      So head over to discuss.yjs.dev and share your experience.
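
      For anyone wondering what those two lines look like, here is a minimal sketch using Yjs with the y-websocket provider (the server URL and room name are placeholders, and the editor binding is omitted):

      ```ts
      import * as Y from 'yjs'
      import { WebsocketProvider } from 'y-websocket'

      // One shared document, synced with every peer in the same room.
      const ydoc = new Y.Doc()
      const provider = new WebsocketProvider('wss://your-server.example', 'my-room', ydoc)

      // A shared type you can bind to an editor via y-codemirror, y-prosemirror, etc.
      const ytext = ydoc.getText('shared-text')
      ytext.observe(() => console.log(ytext.toString()))
      ```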

    • Loved your blog - but I think the parent's question was how to gain the skills of working with / designing OTs/CRDTs etc. on their own. (I am clarifying the question, since I'm interested in the answer as well.)

  • You are not a shitty software engineer who's been "coddled." What you likely are is dissatisfied with where your career is going (and who amongst us is not in some way) and nervous about not having a name brand pedigree so you are turning it inward and trying to understand it in ways that imply the cause and effect are localized to yourself. That's understandable but I wonder if it will get you anywhere. Stop beating yourself up and start exploring what you really want.

    Do you want a brand name? Go work at a FAANG. Trust me, they're hiring. You'll get there and, like me, you'll realize they too have their problems, and that maybe the "over-paid Rails shit solutions" you "know will eventually leak" weren't so bad after all.

    Do you want to level up as an engineer? Go work at a hyper-growth startup (probably series B) that is falling over from its scale problems. You will probably end up solving some oddly challenging and novel engineering problem along the way.

    Do you actually not mind this stuff but hate feeling like you have to always keep up with the Joneses, and it gives you anxiety? Go find a therapist that you have good chemistry with and see if you can't work out why you feel this way and how to fix it. To be honest, the corporate rat race thrives on making people who are otherwise doing quite well with their career progression feel like they're not, because that makes them easier to exploit. Life is not an exam to be min-maxed. You don't take that fancy career or lifetime earnings with you when you die, so unless you're doing it for the intrinsic joy of doing it, there's an argument to be made that becoming better at engineering has no guarantee of making your life better.

    See this blog post for just what it is -- a breathless discovery of a hammer by an author who now wants to use it for everything but may not be at the point where they realize it might not be a great tool for everything. You the person are probably a lot more competent and capable than you're letting on here. And if that's the case and you're unhappy for reasons in your control, home in on why that is and see if you can't make life moves to change it. There's never been a better time to make a career move.

    Good luck. I believe in you.

    • > To be honest, the corporate rat race thrives on making people who are otherwise doing quite well with their career progression feel like they're not, because that makes them easier to exploit. Life is not an exam to be min-maxed. You don't take that fancy career or lifetime earnings with you when you die, so unless you're doing it for the intrinsic joy of doing it, there's an argument to be made that becoming better at engineering has no guarantee of making your life better.

      This. Especially that last line. I feel like most of the time I'm just wanting to get more and more, both financially and challenge-wise. But to what end? Your words here really calmed me down. They got me thinking. I felt like a dog running after a car.

  • When this guy discusses OT vs. CRDT, he's discussing algorithms for some very specific use cases - namely, concurrent document editing a la Google Docs. Realistically, there are maybe 50 people in the world who are versed in these kinds of algorithms at a really deep level. 99% of us are users. If I were pressed into it, I might have the chops to implement a reference implementation for something like this, but I've never come within a country mile of an organization that has the resources to invest in something like that. You can be a great programmer with a very successful career as someone who only knows how to read an API and use it. And unless you are tasked with creating a concurrent document editor, you will probably never need to understand the API for a CRDT.

    • Generally agree, but I bet there are more like 500-5,000 people who understand this kind of stuff. Most modern web tools support at least some basic form of synchronously working on the same document.

  • I wrote an article for folks like you (because I also felt like this was my life at one point).

    In this article I explain the basics of the simplest CRDTs in a way a web developer without a deep knowledge of CS can relate to: https://statagroup.com/articles/editing-shared-resources-crd... . I don't present the concept 100% fully, but I give a starting point for someone who "only knows Rails". I found most online tutorials dive directly into the theory.

    I included all the resources I used to learn the concept on the page too.

    It's part of a series I wrote about editing shared resources ( https://statagroup.com/articles/editing-shared-resources ).

  • I feel ya, I did not graduate either, and didn't go for CS. After having Linux as a hobby, I made my way into a web dev job. The "good enough" work is killing me. I feel like I am pushing and pulling my team to the light, even simple things like DRY.

    I think one thing I'm learning is where to spend my effort. It's definitely a balance, but I'm putting more time/passion into my personal projects/learning, whereas before I saw more opportunities for overlap between my passion and work. When your work doesn't recognize that effort or actively fights against those efforts, it's draining and frustrating.

    I've also been thinking about dedicating some serious time to learning. I keep thinking about the idea of taking a month or two off from work. Just treat myself to my own mini-semester of homeschool. I'm somewhat confident I could learn enough to land a better job, but more importantly, one I enjoy more.

    • You guys are overestimating the degree to which concepts from CS curricula are relevant to the average engineer's work. "Pushing and pulling your team to the light" is normal, can't be solved by getting a CS degree, and is more about your team dynamics than anything.

  • No, you have not locked yourself into anything, you can always grow. Comfort can breed complacency, so my initial advice is to quit your job and try to get a new one that's more demanding. It forces you to kick your own butt into learning new stuff.

    I am an under-educated over-paid GED holding Glorified Sysadmin, and while a lot of the shit solutions I work on will eventually leak, I do spend a non-trivial amount of time looking for better solutions. I also try to keep my ears open for what other people are working on and read up on concepts I'm unfamiliar with, whether they crop up on HN or through work.

    I learned about CRDTs several years ago at a former employer, where a teammate was talking about how they were the future. However, we didn't end up using them. It turns out the more advanced your system gets, the more work goes into it, and the more of a pain in the ass it becomes to run [until it reaches a certain level of operational stability]. So I wouldn't say that using the most bleeding-edge research is a good idea even if you know about it. Most people are still fucking up the basic stuff we were doing 10 years ago, and the neckbeards here probably have loads of stories about how much simpler and more stable distributed systems used to be.

  • > So, having said all of that, is there hope for adding things like this to my toolbelt? And where do I even start?

    HN is a decent place to start. Now that you've read this article, you're aware of the general landscape of concurrent document editing algorithms.

    Ultimately, all you really need is the language to a) describe your problem and b) compare potential solutions. I have a CS degree, and it's not like we spent semesters learning every algorithm under the sun. But what we did learn was the language to describe and compare the performance of algorithms (computational complexity theory). Beyond that, it's just about knowing enough terminology to be able to describe your problem, which you could learn just by skimming through an algorithms textbook (or random Wikipedia articles, honestly).

    For example: a friend of mine was recently trying to devise an algorithm for ordering buff applications for a collectible card game. Once I was able to find the right language to describe the problem (topological sorting), finding an algorithm to use was easy.
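
    To make that concrete: once you have the words "topological sort", an algorithm like Kahn's is a few lines. A sketch (the buff names are made up for illustration):

    ```ts
    // Kahn's algorithm: repeatedly emit a node with no remaining prerequisites.
    function topoSort(nodes: string[], edges: [string, string][]): string[] {
      const indegree = new Map<string, number>();
      const next = new Map<string, string[]>();
      for (const n of nodes) {
        indegree.set(n, 0);
        next.set(n, []);
      }
      for (const [before, after] of edges) {
        next.get(before)!.push(after);
        indegree.set(after, indegree.get(after)! + 1);
      }
      const ready = nodes.filter((n) => indegree.get(n) === 0);
      const order: string[] = [];
      while (ready.length > 0) {
        const n = ready.pop()!;
        order.push(n);
        for (const m of next.get(n)!) {
          indegree.set(m, indegree.get(m)! - 1);
          if (indegree.get(m) === 0) ready.push(m);
        }
      }
      return order; // shorter than nodes.length means there was a cycle
    }

    // "armor buffs apply before damage buffs" becomes an edge:
    const edges: [string, string][] = [['armor', 'damage']];
    topoSort(['damage', 'armor', 'haste'], edges);
    ```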

  • The article mentions Martin Kleppmann. Go buy a copy of his "Designing Data-Intensive Applications". It might as well have been titled "Practical Distributed Systems for the Everyday Developer". It is an absolutely fantastic book with a perfect ratio of theory to practice.

    Extra points for buying a dead tree copy and reading it without a thousand alerts and internet temptations vying for your attention :)

  • Sounds like you've got an awareness of CRDTs now. That's a win and a differentiator from many other engineers.

    I already knew about them, but there will be a stack of other stuff I don't know about. Just get Googling when you hit different problem areas.

    I found out about CRDTs because I was working in a niche where a security device was used that breaks TCP between two hosting regions for an application. So I searched for issues to do with that and refreshed myself on the Two Generals' Problem, Byzantine fault tolerance, two- and three-phase commits, and at some point OTs and CRDTs popped up. You'll find this stuff if you keep looking at a problem for long enough.

    That was a while ago though and despite still working in that area I've still not implemented anything with them because I'm a contractor and I'm under the direction of people who don't have CRDTs in their toolbelt and don't want ideas from others! I guess my cheerful message is keep at it but know you can only do so much and there's plenty of other pits of despair to fall into even when you're out of the current one. Yay!

  • I never graduated, but that was back in the 80s. The important part is not getting an "official" education, but learning how to self-learn. You don't necessarily need to understand the underlying maths if you can understand the general theory.

    And you're not a "shitty software engineer" if you are self-aware enough to know what you don't know. Read, start with Wikipedia, look up the references, play with stuff, explore.

    If you need to understand certain things at a deeper level, then explore the classes and other guided tuition.

    I've been doing this stuff for 30 years and I still learn every day. You will produce less-than-optimal work forever. As a master craftsman (which is what you are, not an "engineer") you are always learning and improving your craft.

    You can be satisfied when completing some work if you did your best to produce it and it works within the constraints you had. There will always be things you would do differently or better, so do that next time.

  • CRDTs are just applied partial order theory. Partial orders are just mazes where you can drag your finger in a certain vague direction, taking arbitrary turns in that vague direction, and still find the exit.

    Get your intuition down really well and 10+ years is not required.
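
    To make the intuition concrete, here's a sketch using version vectors, a classic partial order (hypothetical shapes):

    ```ts
    // a <= b iff every component of a is <= the matching component of b.
    // When neither direction holds, the two states are concurrent:
    // incomparable in the order, yet still mergeable.
    type VersionVector = Record<string, number>;

    const leq = (a: VersionVector, b: VersionVector): boolean =>
      Object.entries(a).every(([peer, n]) => n <= (b[peer] ?? 0));

    function compare(a: VersionVector, b: VersionVector) {
      if (leq(a, b) && leq(b, a)) return 'equal';
      if (leq(a, b)) return 'before';
      if (leq(b, a)) return 'after';
      return 'concurrent'; // arbitrary turns, same exit
    }
    ```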

  • I can give a philosophical answer rather than a technical one. Our lives are filled with learning various things. And we all realize there is no end to it. But truly what makes you feel better about what you do is not this endless learning. It is when you can drop what you learnt and think freely about a problem that you can solve. Until you reach there, you are in a boat tossed by the waves trying to grab whatever you can to keep yourself afloat. Please disregard this answer if it is not of any help.

  • Picking up Elixir is a decent way to get into backend distributed systems from a Ruby/Rails background.

    • I think this is surprising levels of true. Elixir really works as an on-ramp from web to a lot of other stuff.

  • I love the candor in your post but you're being too hard on yourself.

    Learn functional programming, gradually. You'll get there. I like to think I reverted a decade of brain damage inflicted by enterprise Java development in three years.

  • I've worked for multiple Rails shops with high quality engineers. The trick is to interview them too before accepting any offer. The best teams I've been on have been small. Past a team/project size, maintaining high quality code becomes much more difficult.

  • Check out http://www.roomservice.dev

    It turns your front-end app multiplayer with a library.

    They manage the hard part and give you a simple API to get the benefits of realtime/syncing/diffing.

    • > The major downside of having an “always syncable” data structure is that it gets really big, really quick

      Perhaps it would be prudent to read the actual article before tossing off this sort of comment since it's indeed the whole point?

  • This is my daily struggle. I cannot get rid of the idea that advancing my career only increases how much I earn, instead of making me a better programmer.

  • > And where do I even start?

    There are so many resources available to learn CS today. Are you really baffled about how to learn outside of school? Read. Practice. Repeat.

The three most recent HN discussions on CRDTs are all worth perusing.

[1] is an excellent tutorial that assumes no initial familiarity with CRDTs or the math that underpins them. It walks you through both the formalisms and the implementation, which is pretty key to understanding why making real-world CRDTs flexible enough to handle things like rich text editing is hard.

[2] is a talk that goes more in-depth on the hard parts

[3] goes deeper on OT vs. CRDT

It's worth noting that many of the CRDT discussions focus on collaborative text editing. That's a really hard problem. CRDTs are (and have been for some time) a useful primitive for building distributed systems.

[1] https://news.ycombinator.com/item?id=22039950

I use CRDTs in production at Jackbox for audience functionality and honestly I don't know why the only thing that people talk about when it comes to CRDTs is collaborative text editing. Like, sure, cool, but that's literally a single problem domain and like, Google Docs already exists and works well for the majority of users so how many developers actually need to create a collaborative text editor? CRDTs are an incredibly abstract and versatile concept; collaborative text editing is a tiny problem domain. I would really like to see more writing about CRDTs that is NOT about collaborative text editing.

  • Author of the blog post here. I’ll let you in on a dirty secret: I agree with you.

    I see text editing as the beachhead for this tech. Text editing is hard enough that other systems don’t work well, so you’re kind of forced to use OT or CRDTs to make it work. And text documents are lists of characters - so once you’ve made it work there you have an implementation that can also sync arbitrary lists. At that point, editing maps (js objects) and adding support for arbitrary moves and reparenting will allow us to add real-time editing to a lot more software.

    I think there’s a lot of interesting software architectures that can be enabled by making everything in a system a CRDT document. But text is still the beachhead. And a good litmus test for performance. And an important application in its own right.

    • On the one hand, the generality of the text editing solutions is really powerful, and I see what you mean in terms of that solution generalizing to other domains. But on the other hand, I always think back to how popular Memcache or Redis were even early on, when they had very, very few features. Just having a fast, in-memory cache of strings empowered a lot of interesting new product development. I really wonder how much the average developer on a random project could get out of an appliance that lets you create and mutate an arbitrarily large number of values of the well-known, simple CRDT types like g-counters, pn-counters, 2P-sets, etc.

      Most of the literature is focused on "how do we merge the most complex data type", and not questions like "how do we manage the entire lifecycle of CRDT stores", "how does a CRDT store fit into most people's stack", "should CRDT types be added to existing stores or should developers expect dedicated CRDT-only stores", or "do people generally need to define their own CRDT types or do most people just want a box full of common ones".

      I hand-rolled my own CRDT setup just to get the simplest CRDT types because I didn't see anything out there that makes directly consuming CRDT types accessible to application developers. E.g., you make a g-counter and literally all a client can do with it is increment it or read its value. That's it. We have that, and it's totally useful! We also do entirely state-based replication, because expressing the operations on the data would be so much larger than the data itself.

      But our situation is just so off-base for many people, because clients are only ever interested in the current state (and never care about the past state), and we can safely ignore the problem of when to delete the data; we just keep it around until you're finished playing the game, and then delete it when the game is over.
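
      For reference, a state-based G-counter really is that small. A sketch (illustrative, not Jackbox's actual code):

      ```ts
      // Each replica bumps only its own slot; merge is an elementwise max,
      // so it is commutative, associative, and idempotent.
      class GCounter {
        private slots: Record<string, number> = {};
        constructor(private readonly replicaId: string) {}

        increment(n = 1): void {
          this.slots[this.replicaId] = (this.slots[this.replicaId] ?? 0) + n;
        }
        value(): number {
          return Object.values(this.slots).reduce((sum, n) => sum + n, 0);
        }
        merge(remote: Record<string, number>): void {
          for (const [id, n] of Object.entries(remote)) {
            this.slots[id] = Math.max(this.slots[id] ?? 0, n);
          }
        }
        state(): Record<string, number> {
          return { ...this.slots };
        }
      }
      ```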

    • How could CRDTs be used to collaborate in a project made with a visual programming language that consists of interconnected operators? Is it necessary to serialize this graph?

  • I love the Jackbox games.

    However, in my experience most games' shared world state cannot be reconciled if one player's action is inserted in time prior to another player's action, or if future actions reveal hidden information.

    Even turn-based games have turn timers - including Jackbox games!

    Among public databases, there may be only two correct CRDT implementations in the world - Cassandra's and etcd's. Compare that to how many database products make CRDT or CRDT-like promises, and how few probably actually work.

    Lots of software promises CRDT-like behavior over actors, e.g. Erlang or the new Cloudflare product. I'm not really qualified to talk about them.

    Most applications normal people use boil down to chat conversations - chat apps, Gmail, Facebook, collaborative text like you said, all sorts of log-like things - or serial, recommendation-driven consumption - YouTube, Netflix, which use incremental / batched cells in a matrix to update their recs. This stuff is insensitive to order and time. Then there's e-commerce, which is inherently linear, centralized, single-actor, all-or-nothing transactions; CRDTs provide no advantages there.

    It's tricky. On the one hand you'd get this interesting discussion of CRDT applications. On the other it may wither under intense scrutiny.

  • We want something like that for distributed collaboration in Ardour, a cross-platform DAW. The relevant state is serialized as an XML file, and used natively as a complex object tree. Users want to be able to edit (in the DAW) locally, then share (and merge) their results with collaborators.

    • I've plugged this collaboration project a few times recently, and have no relationship to it other than discovering it (via YJS' "who is using" list[1]) and finding it fascinating:

      http://cattaz.io/

      What I find most interesting about it is that it has reduced the state of multiple 'smart' user-facing widgets/apps into a common, lowest-common-denominator format (a text document) that lends itself more easily and intuitively to collaborative editing and CRDT operations.

      I don't know for sure whether this is the path forward for CRDT-based applications in general, but I think there are valuable ideas there. It does raise the possibility of the widgets/applications occasionally being in 'invalid' states; but rarely in a way that the human participants wouldn't notice or be able to fix themselves.

      Whether that scales to the complexity of the state management for a multi-track audio editing session, I don't know; but it could be instructional to compare.

      [1] - https://github.com/yjs/yjs#who-is-using-yjs

    • In real time? Well, I have thoughts, but I'm not super familiar with Ardour itself, so I'm not sure if you're trying to merge during a live performance or if you're talking more about a distributed studio-recording-session type of situation. I have working knowledge of Reason and Logic and ChucK (which I use with JACK and do some networked OSC stuff with, although I haven't touched it in a few years).

      The approach we use at Jackbox for making the state of an Xbox game mutable by thousands of live viewers on Twitch is to have lots of little CRDT values, mostly just counters and sets of strings, and to merge the little values independently of one another. That is very different from the situation of editing a text document, which is typically structured as one big value.

      I wonder if, for a DAW, you could merge at the track or control level instead of the workspace level. E.g., communicate the state of an individual fader as an independent value, communicate either states or operations on that fader, and have each client merge them. In this example, the fader's state would be encoded as a PN-counter with a range condition, and you'd replicate increment and decrement operations, like it was a networked rotary encoder. So every mutable thing in the DAW would be a value with operations that can be merged, instead of a single big value representing the entire state of the DAW.

      My use-case is also funky because I have potentially thousands of writers writing to the same key but only a single reader, and the reader doesn't need an update at every change, so I use state-based CRDTs; I think most other people using CRDTs use operation-based CRDTs. I'm also not sure how you would mutate two separate values transactionally, or if that's a thing you even need.
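
      A sketch of that fader idea (hypothetical names, not Jackbox's actual code): a PN-counter is just two grow-only maps, with the read clamped to the fader's range:

      ```ts
      type Slots = Record<string, number>;
      const total = (s: Slots) => Object.values(s).reduce((a, b) => a + b, 0);
      const mergeMax = (into: Slots, from: Slots) => {
        for (const [id, n] of Object.entries(from)) into[id] = Math.max(into[id] ?? 0, n);
      };

      // PN-counter with a range condition: nudges accumulate per replica,
      // the state-based merge is elementwise max, and reads clamp to [min, max].
      class Fader {
        private ups: Slots = {};
        private downs: Slots = {};
        constructor(private id: string, private min = 0, private max = 127) {}

        nudge(delta: number): void {
          const side = delta >= 0 ? this.ups : this.downs;
          side[this.id] = (side[this.id] ?? 0) + Math.abs(delta);
        }
        value(): number {
          const raw = total(this.ups) - total(this.downs);
          return Math.min(this.max, Math.max(this.min, raw));
        }
        merge(remote: { ups: Slots; downs: Slots }): void {
          mergeMax(this.ups, remote.ups);
          mergeMax(this.downs, remote.downs);
        }
      }
      ```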

  • We use CRDTs for a scalable, availability-first service discovery implementation. I'm sure there are more uses out there. We use Akka Distributed Data, and there are many users of that.

  • just wanted to say, i really like Jackbox's games and it's really fun to see you here! i've been exposed to your games via youtube vids and it feels weird (but cool!) to have that intersect with HN. would love to read more about how you use CRDTs to make it all work! (i'm pretty clueless re: multiplayer games so even a high-level explanation would be interesting)

  • Holy shit CRDTs totally make sense for Jackbox. (Btw thank you for the awesome games!)

  • Thanks for your work on Jackbox! Never had an issue in the games and they are perfect for parties. I always thought it would be fun to make a Jackbox-type game system.

If you're a young technical entrepreneur looking for a 10-100M startup opportunity with a very interesting technical challenge behind it: create a collaborative replacement for Jupyter Notebooks. There's already some effort in a JupyterLab fork if you're interested [0], but with no significant advancements.

So yes, I agree that CRDTs are indeed a promising endeavor.

[0] https://github.com/jupyterlab/jupyterlab/issues/5382

  • CoCalc is a collaborative replacement of Jupyter notebooks. It's a top-to-bottom re-implementation of the entire Jupyter stack designed specifically for realtime collaboration. You can use it via our hosted offering (https://cocalc.com), or install it on prem via https://github.com/sagemathinc/cocalc-docker.

    We released our collaborative Jupyter notebook in 2014 as a plugin to Jupyter classic. We then iterated on what we learned over the years, completely rewriting everything multiple times, including the entire realtime collaboration stack. CoCalc's Jupyter support is pretty mature and battle-tested at this point, and also includes a TimeTravel slider that lets you view all past versions of a Jupyter notebook, plus integrated chat.

    I was a college professor (at the University of Washington), and I started a company around this in 2015, so CoCalc has so far been mainly aimed at serving the needs of academics teaching courses. It's been increasingly popular lately; e.g., in the last month over half a million distinct Jupyter notebooks were edited on https://cocalc.com. Of course, many of these notebooks are homework problems. Anyway, our company is doing very well, and we hope it will eventually be a "10M startup opportunity". :-)

  • Domino Data Lab has been around for a while and closed another $43M in funding earlier this year. They have a boatload of tools around collaborative notebooks. They go even further and have data science manager-level dashboards to track the notebooks, their resources, and who is working on what. There are others, but I'm calling this company out specifically because they've shown great traction and I've spent a little time with the cofounders when they were still at a shared incubator space.

  • Is it really such a good idea to entice young people like this? Shouldn't someone at least be interested and have domain knowledge in CRDTs and real-time collaboration before diving into building a startup like this?

    • There's no need to gatekeep building something on already having knowledge.

      If someone has time and energy and desire, not knowing anything about document editing or CRDTs is not a blocker. Those things can be learned in a week to a month by someone who dedicates time to it.

      Very few parts of software are inaccessible to someone with basic CS knowledge. It's a great idea for people to try something, regardless of their background, and if they fail but learn something, that's still a fine outcome.

    • Worked for Figma, right?

      I'm sure they fall into the collaborative software space, utilise CRDTs, and the founders are less than 40 years of age.

      This seems like gatekeeping, no?

A question that's been on my mind for a while: why do Version Control and Collaborative Editing work at such cross purposes with each other when they are essentially solving the same problem? The biggest difference is that one works interactively and the other favors a CLI. Beyond that, how much of the distinction is artificial?

In particular I've been wondering about the space between CRDTs and the 'theory of patches' such as we discussed with Pijul the other day.

I have a collaborative editing project that's been sitting in my in-box for a long time now because I don't want to write my own edit history code and existing tools don't have enough ability to reason about the contents as structured data. The target audience is technology-averse, so no 'dancing bears' are going to interest them. It's not enough for it to work, it has to work very well.

  • As it stands today, version control and collaborative editing do not solve the same problem. Version control deals with large chunks of changes at a time. I don't even particularly want a version control system that stores every single keystroke made in source code. [1] Collaborative editing deals with keystroke-by-keystroke updates. By the standard of collaborative editing, even a single-line source control commit is a big change.

    The problem spaces are quite different. Problems that emerge on a minute-by-minute basis in collaborative editing emerge on a week-by-week basis in source control, and when the problems emerge in the latter, they tend to be much larger (because you can build up a much bigger merge conflict on a routine basis with the big chunks you're making).

    Yes, it's true that if you squint hard, it looks like version control is a subset of collaborative editing, but I'd be really hesitant to, say, try to start a start-up based on that observation, because even if we take for the sake of argument that it's a good idea to use the same underlying data structures, the UI affordances you're going to need to navigate the problem space are going to be very different, and some of the obvious ways of trying to "fix" that would be awful, e.g., yes, you could give me a "collaborative space" where I see what everybody's doing in their code in real time... but it's a feature, not a bug, that when I'm working on a feature I'm isolated from what everyone else is doing at that exact moment. When I run the compiler and it errors out, it's really, really nice to have a good idea that it's my change that produced that result.

    (I'm aware that collaborative editing also has the "I was offline for a week and here's a bunch of conflicts", but I'm thinking in terms of UI paradigms. That's not the common case for most/all collaborative editing systems.)

    [1]: Not saying the only solution is the one we have now. A magic genie that watched over the code and made commits for you at exactly the right level of granularity would be great, so you'd never lose any useful context. But key-by-key isn't that level of granularity.

    • Version control is collaborative editing. Synchronizing on every keystroke is real-time collaborative editing. That's nice if you're working on overlapping data at the same time. In code this does not happen as often, because code repositories tend to be large.

      Git does not work well for text because we have not yet figured out a nice format for text that developers and other people both enjoy. Developers want to stick to plain text as their format because we have so far failed to create nice tools and formats for structured data. Perhaps these affordances can appear thanks to the popularization of real-time collaborative editing.

    • One of the reasons we compartmentalize code is so that people can work on unrelated features without tripping over each other at every turn.

      The bits where they don't interact also don't conflict. The bits where they do, look a lot more like collaborative editing.

      They're also the spots where merges usually go wrong.

  • Around 6-7 years ago we started a collaborative editing project for prezi.com. The problem basically boiled down to concurrent editing of a big DOM-like data-structure. We looked at the little literature that was available at the time including OT and CRDTs, but quickly realized that none of the existing approaches were mature enough for our needs. All of them were stuck at "text editing", but we needed to edit these big object DAGs.

    So we ended up essentially implementing what you laid out, an in-memory revision control system, although using a bit more formal methods to reason about divergence/convergence of clients. The most basic operation was the "diamond merge": given operations x: A -> B and y: A -> C, construct x': C -> D and y': B -> D such that x' . y == y' . x. It also had to satisfy certain other algebraic laws, notably diamond composition, which allowed us to compose these merging operations whenever we wanted, guaranteeing that the clients would eventually converge to the same data state. It was quite neat! Shame that it's all proprietary.

    Good old days. I remember the peskiest part was implementing a good undo-redo algorithm; it's quite tricky, even once you add inverses.
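
    In (hypothetical) code terms, the law might read:

    ```ts
    // Sketch of the diamond merge law; Doc and Op are stand-ins for the
    // proprietary system's real types.
    type Doc = unknown;
    type Op = (d: Doc) => Doc;

    // Given x: A -> B and y: A -> C, produce x': C -> D and y': B -> D...
    declare function diamond(x: Op, y: Op): { xPrime: Op; yPrime: Op };

    // ...such that both paths around the diamond agree on every document a:
    //   xPrime(y(a)) === yPrime(x(a))
    ```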

  • Strong agree.

    There's a next level of VCS forming on the horizon, in some combination of CRDTs, patch theory, and grammar-aware diffing.

    Which should also learn from fossil, and consider metadata such as issues and surrounding discussions to be a part of the repo.

    A really robust solution would also be aware of dependencies and build systems, and even deployment: I see these as all fundamentally related, and connected to versioning in a way that should be reflected and tracked through software.

    • If you look into Bazel (a build system), you get to the point where everything, including dependencies, build system, and deployments, can be defined as "source" code and ideally should be treated as first-class software.

  • Cloud-based code environments are starting to merge these. GitHub Codespaces, for one, is heading this way. I don't know if they use Operational Transformation (OT) or Conflict-free Replicated Data Types (CRDTs), but they are repo-backed. I assume it is just using GitHub's diffing tools in the repos, and maybe OT/CRDTs in live sessions over WebRTC or similar.

    Much of real-time collaboration goes back to networking, and real-time networking is used in distributed multi-user systems like games, where simulations need to sync on a server. In games, though, dead reckoning [2] is used, as well as interpolation and extrapolation for prediction. Much of it can be slightly different, for instance with physics/effects, but messages that are important to all, like scores or game start/end, are reliably synced and determined on the server.

    [1] https://visualstudio.microsoft.com/services/github-codespace...

    [2] https://www.gamasutra.com/view/feature/131638/dead_reckoning...

  • Author of the blog post here. I totally agree with you.

    People think of OT / CRDT as realtime algorithms for realtime collaborative editing because they're always programmed and used that way. But the conflict resolution approach doesn't have to merge everything as-is. You could build a CRDT or OT system that generated VCS-style conflicts if concurrent edits happen on the same line of code. To make it a valid OT / CRDT algorithm, the main constraint is just that every peer needs to resolve conflicts the same way. (So if I merge your changes or you merge my changes, we end up with identical document states.) It would be easier to implement using OT because you only have to consider the interaction between two peers, but I think it's definitely doable in a CRDT as well.

    I think having something that seamlessly worked in both pair programming setups and with git style feature branches & merging would be fantastic.

    I have a lot of thoughts about this and would be happy to talk more about it with folks in this space.

    • I've approached this problem from a different angle. I thought one could embrace the possibility of several truths. In my solution a truth lives in what I call a layer. Different layers can then be projected together in different stacks. Instead of using feature branches/toggles one can change the stack by adding/removing/reordering layers. One can also handle localized content this way, which was the original use case before the idea mutated.

      I also thought one could distinguish between two different kinds of conflicts. I call them simultaneities and disagreements. Simultaneities happen when concurrent changes happen and could be handled by CRDTs, for example. Disagreements are conflicts between layers.

      The idea is then that you can choose to work in the same layer as someone else, if you are working close. You can also "shield" yourself from changes in other layers by working in a different layer. If you want to detect a disagreement between your layer and a layer someone else is working on, you project your layer on top of that layer.

      Even though I believe in these ideas I don't know how to get other people interested in them. It might be that they are flawed in an obvious way not apparent to me.

      What would it take to make someone of your caliber curious?

  • My understanding may be flawed, but as far as I know you can think of an OT log and a git log as being similar. Each party generates deltas to the data structure that are recorded in the log, and when these parallel histories meet they must be merged. OT merges without involving the user, which sometimes leads it to discard changes. Git merges like that if it can, but when something must be discarded it asks the user. It is the interactive merging and deep ability to navigate and edit the log of changes that makes git so command-liney.

    • Not intending to nit-pick, but Git doesn't store the content as deltas. Each commit is a snapshot of the entirety of the codebase at that point in time.

  • Line-oriented data formats vs. everything else. Why? Because of "patching theory". If you don't understand that the data describes objects and doesn't have line-by-line semantics, it is hard to get merges correct.

    Version control works wonders with line-oriented stuff, which covers more or less every programming language in existence.

    It doesn't do so well with non-line-oriented structured formats such as XML (not sure how JSON or TOML fits in here).

    Given that collaborative editing typically works with non-line-oriented data formats, you can see the issue, I think.

    • That's what I refer to as "grammar-aware diffing" in the sibling comment, and it's one of the low-hanging fruits here.

      Even git allows for pluggable diffing, and doesn't force line orientation. What's missing is the concept of moving something, as distinct from deleting lines/chunks and then inserting lines/chunks which just happen to be the same.

      This is not a problem which CRDTs have, to put it mildly. I believe pijul understands it as well. A lot of this stuff is right out on the cutting edge, and as it matures it will become practical to connect the edges, such as a CRDT which collaborates with a parser to produce grammar-aware patches which are automagically fed to pijul or something like it.

      This comes with a host of problems, mostly that we're not used to dealing with a history which has this level of granularity, most of which we don't want to see, most of the time. But they would be nice problems to have.

If you're interested in building collaborative apps but not in the architectural overhead of implementing CRDTs, I'd recommend checking out roomservice.dev [1]. They've begun to power some other collaborative apps such as Tella.tv [2] - realtime browser-based video editing.

[1] https://roomservice.dev

[2] https://news.ycombinator.com/item?id=24158509

Disclaimer: I've invested in roomservice.dev (and am very excited about what they're building!). No affiliation with Tella.

I have been consistently at odds with myself comparing CRDTs vs. OT. On the one hand, CRDTs have a nicer core formalism. On the other hand, OT works, and it is closer to the actual driving events of text editing.

The core argument of this article - that CRDTs now work, and that distributed is better than centralized - I question. I certainly want more distribution than "everything is run on a Google server", but do I really foresee a need for distributing a single document? One server with an optimal OT implementation can probably handle near a million active connections.

In practice, that's plenty. Each piece of data having one owner is quite reasonable. There are lots of pieces of data.

I remain on the fence for collaborative text editing. Though it's great to see all the work pushing CRDTs forward!

  • Blog author here. I've been having this conversation with a lot of folks over the last few weeks and I hear you.

    Does it make sense for us as an open-source community to invest our time and energy making one really good CRDT (with implementations in a few languages)? Or does it make sense for us to distribute that energy between a bunch of CRDT and OT implementations with different performance trade-offs?

    My take is that it's been hugely beneficial to us all that JSON is a standard, because I can use it from every language and have confidence that the implementations are fast and good quality. I think we have an opportunity to make that for a good CRDT too. Even if OT would work fine in your architecture, if we have a great, capable, fast CRDT kicking around, it could be a reasonable default for most people. And my claim is that the performance difference between CRDTs and OT is smaller than the difference between high- and low-quality implementations. (I expect a well-written CRDT in WASM will outperform my OT code in JavaScript.)

>>> Philosophically, if I modify a google doc my computer is asking Google for permission to edit the file. (You can tell because if google’s servers say no, I lose my changes.) In comparison, if I git push to github, I’m only notifying github about the change to my code. My repository is mine. I own all the bits, and all the hardware that houses them. This is how I want all my software to work.

I also long for this future. I want all software to work in this manner.

It would be interesting to know how CRDTs are enabling this, or how the limitations of OTs have restricted the development of these tools (although git and GitHub exist regardless).

If you don't use CRDTs, you may be doomed to re-invent them. Reading about them just now, I realized that I spent the last year developing a CRDT with LWW (last-writer-wins) and OR (observed-remove) characteristics.

edit: updated 'you are doomed' to 'you may be doomed'.
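
For the curious: the LWW half of that can be as small as a register whose merge keeps the write with the higher timestamp, breaking ties by replica id so every replica agrees. A sketch with hypothetical shapes:

```ts
// Last-writer-wins register: deterministic merge on (timestamp, replica).
type LwwRegister<T> = { value: T; timestamp: number; replica: string };

function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.replica > b.replica ? a : b; // tie-break keeps replicas in agreement
}
```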

When dealing with this type of discussion, I always try to remember that making design decisions is a trade-off - a judgment call highly dependent on your knowledge of the field, but also on context and taste.

Believing there is a silver bullet is a fool's errand.

From what I've read about CRDTs, it seems difficult to escape the overengineering trap when dealing with them.

  • I tend to agree. Each team, project, and organization has different needs, preferences, and cultures. One-size-fits-all is a really tall order.

    I believe it's better to focus on kits of parts - APIs and/or self-contained functions - that can be combined or ignored as needed, along with a variety of reference application samples.

    Having lots of ways to easily filter and sort content is also very useful. For example, filtering and/or sorting annotations by person, group, date, content (sub-strings) is very useful. A query-by-example kind of interface is nice for this.

I hate that I am skeptical of this. I suspect Wave just left that bad a taste behind. So much hubris in what was claimed to be possible.

The ideas do look nice. And I suspect the field has gotten farther than I give it credit for. However, sequencing the edits of independent actors is likely not something you will solve with a data structure.

Take the example of a doc getting overwhelmed. Let's say you can make it so that you don't have a server to coordinate. Is it realistic to think hundreds of people can edit a document in real time at the same time and come up with something coherent?

Best I can currently imagine is it works if they are editing hundreds of pages. But, that is back to the basic wiki structure working fine.

So, help me fix my imagination. Why is this the future?

  • In the case of a text document, concurrent edits form branches of a tree in many string CRDTs: http://archagon.net/blog/2018/03/24/data-laced-with-history/

    So yes, hundreds of people can edit a string and produce a coherent result at the end. Contiguous runs of characters will stick together and interleave with concurrent edits.

    • CRDTs don't guarantee coherence, but instead guarantee consistency.

      The result may often be coherent at the sentence level if the edits are normal human edits, but often will not be at the whole-document level.

      For a simplistic example, if one person changes a frequently-used term throughout the document, and another person uses the old term in a bunch of places when writing new content, the document will be semantically inconsistent, even though all users made semantically consistent changes and are now seeing the same eventually-consistent document.

      For a contrived example of local inconsistency, consider the phrase "James had a bass on his wall." Alice rewrites this to "James had a bass on his wall, a trophy from his fishing trip last summer," and Brianna separately chooses "James, being musically inclined, had hung his favorite bass on his wall." The CRDT dutifully applies both edits, and resolves this as: "James, being musically inclined, had hung his favorite bass on his wall, a trophy from his fishing trip last summer."

      In nearly any system, semantic data is not completely represented by any available data model. Any automatic conflict-resolution model, no matter how smart, can lead to semantically-nonsensical merges.

      CRDTs are very very cool. Too often, though, people think that they can substitute for manual review and conflict resolution.

    • I upvoted for the link alone. This article (data-laced-with-history) is the best resource if you are starting your journey into CRDTs.

    • What if the document starts empty and syncing doesn't happen until everyone presses submit? Will the CRDTs produce a valid document? Yes. Will it make any sense? Who knows. I think that's what the OP is getting at.

  • > Is it realistic to think hundreds of people can edit a document in real time at the same time and come up with something coherent?

    And here's the thing: Can 100 people edit a document, even in theory, and have it make sense? I think the answer is "no," with or without technology.

    I'm sure there are other uses for these data structures, but shared editing is always the example I read about.

    • Ultimately I think the answer is "it depends", but the issue is that there is usually document structure which is not visible in the data structure itself. For example, imagine getting 100 people to fill out a row on a spreadsheet about their preferences for some things or their availability on certain dates. If each person simultaneously tries to fill in the third row of the spreadsheet (after the headings and the author), then a spreadsheet CRDT would probably suck at merging the edits. But if you had a CRDT for the underlying structure of this specific document, you could probably merge the changes (e.g. sort the set of rows alphabetically by name, and do something else if multiple documents have rows keyed by the same name).

    • It depends how big the document is, i.e. what is the density of users per page. If it's a 100 page document and the 100 users are all working on different sections, then it could easily be possible.

      I just don't remotely see a use case for this. Real-time human collaboration in general fails at a scale much smaller than this, and not because of the tools available.

    • A question I always have is: if CRDTs solve some problem with collaborative editing, can git's merge algorithm be rewritten to use CRDTs and benefit from it somehow?

      Somehow I think the answer is no. There is a reason we still have to manually drop down to a diff editor to resolve certain kinds of conflicts after many decades.

    • Depends on what kind of document we're talking about, i.e. how the grammar captures the domain model. E.g. a shared ledger in the case of digital currencies, or the Linux source code being worked on remotely by many people, are exactly examples of such documents.

    • Maybe if your "document" is the Encyclopedia Britannica? Wikipedia has hundreds of editors working at once, but that only really works because it's broken up into millions of smaller parts that don't interact much.

    • I meant this to be my takeaway. The data structure is nice. And I suspect it is a perfect fit for some use cases. I question the use case of shared editing. Not just the solution, but the use case.

  • Your key insight, which is spot-on, is that nothing can prevent human-level editing conflicts.

    If I was going to take an attempt at justifying the importance of CRDTs, I would say:

    CRDTs are the future because they solve digital document-level conflict.

    They don't bypass the problem the way that diff/patch/git conflict resolution does, by requiring human intervention.

    Instead they truly and utterly obliterate the digital conflict resolution problem: a group of people editing a document can separately lose network connectivity, use different network transports, reconvene as a subgroup of the original editors... and their collective edits will always be resolved automatically by software into a deterministic document that fits within the original schema.

    If viable, this has far-reaching implications, particularly related to cloud-based document and sharing systems.

    • But how do they obliterate it? They just move the authority, no?

      That is, say you get a hundred machines editing a document. They split into partitions for a time and eventually reunite into a single one. What sort of coherent and usable data will they produce? Without basically electing a leader that rejects some branches of the edits and sends them back to the machines that were rejected?

      3 replies →

  • > sequencing the edits of independent actors is likely not something you will solve with a data structure.

    Any multiplayer game does this. Git does this as well.

    So of course you can do this, it's a matter of how you reconcile conflicts. Real-time interactive games will generally choose a FIFO ordering based on what came into the server's NIC first. Git makes the person pushing the merge reconcile first.

    For docs, live editing seems to work the same as in games. Reconciliation for the decentralized workflow will be interesting, but it's just going to be minimizing the hit to a user when their version loses the argument.

    • But git doesn't do this. It punts to the user pretty quickly. (Even when it performs a merge, it is expected that the user confirms the merge with a build. That is, git just claims the states combined without stomping on the same lines of the same files. The merge has to be verified by a user, from git's perspective.)

      Games, similarly, typically have a master state server. Yes, they optimistically show some state locally, but the entire process checkpoints with a central state constantly. (Else you get jerky behaviors pretty quickly as the states diverge more and more in ways that can't be reconciled without committing to a branch over another.)

      Edit: that is, I would think the point is to force more, smaller arguments. Anything else puts more at risk as you lose one. Right?

      3 replies →

  • "Twitch plays Google Docs" is always going to be incoherent, for social reasons. CRDTs can make it possible, they can't make it a good idea.

    But for a contrived example, a game with hundreds of players, backed by an enormous JSON document, where the game engine is in charge of making sure each move makes sense: A CRDT could enable that, and each player could save a snapshot of the game state as a simple text file, or save the entire history as the whole CRDT.

    Or as a less contrived example, instead of a game, it's a chat client, and it provides rich text a la Matrix, but there's no server, it's all resolved with CRDTs and all data is kept client-local for each client.

    There are a lot of cool things you can build with a performant CRDT.
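
    As a rough sketch of that snapshot-vs-history idea, using automerge's JS API from memory (treat the details as approximate, not authoritative):

        import * as Automerge from "automerge";

        type GameState = { players: { [name: string]: { x: number; y: number } } };

        let doc = Automerge.from<GameState>({ players: {} });

        // The game engine validates each move before recording it as a change.
        doc = Automerge.change(doc, "alice joins", d => {
          d.players["alice"] = { x: 0, y: 0 };
        });

        // Save the entire history (the whole CRDT) as bytes...
        const history = Automerge.save(doc);

        // ...or snapshot just the current state as a simple text file.
        const snapshot = JSON.stringify(doc);

        // A player's divergent copy merges back deterministically.
        let other = Automerge.load<GameState>(history);
        other = Automerge.change(other, "bob joins", d => {
          d.players["bob"] = { x: 5, y: 2 };
        });
        doc = Automerge.merge(doc, other);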

No, peer2peer lockstep is the future. No central server, no speed penalty. No storage penalty.

Has been used in RTS games to synchronize 1000s of units across low-bandwidth connections.

Input may be delayed by latency, which can be mitigated with client-side prediction. Cosmic-ray bit flips & indeterminism can be a challenge in longer sessions, but peers can re-sync with each other when an out-of-sync (OOS) state is detected.
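
A toy sketch of the lockstep core, for the unfamiliar (simplified and of my own devising; real engines add rollback, prediction, and fixed-point math to keep the simulation deterministic):

    type Input = { tick: number; player: number; dx: number };
    type GameState = { positions: number[] };

    const INPUT_DELAY = 3; // schedule inputs a few ticks ahead to hide latency
    const pending = new Map<number, Input[]>(); // tick -> inputs from all peers

    function enqueueInput(currentTick: number, player: number, dx: number) {
      // Only this tiny input is exchanged between peers, never the game state.
      const tick = currentTick + INPUT_DELAY;
      const list = pending.get(tick) ?? [];
      list.push({ tick, player, dx });
      pending.set(tick, list);
    }

    // Must be fully deterministic: same inputs in, same state out, on every peer.
    function step(state: GameState, tick: number): GameState {
      const inputs = (pending.get(tick) ?? [])
        .sort((a, b) => a.player - b.player); // identical order everywhere
      const positions = [...state.positions];
      for (const i of inputs) positions[i.player] += i.dx;
      return { positions };
    }

    // Peers periodically exchange cheap state checksums to detect an OOS.
    function checksum(state: GameState): number {
      return state.positions.reduce((h, p) => (h * 31 + p) | 0, 7);
    }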

  • Usually in games you have some sort of mechanism for determining what is the 'truth' in terms of game state. I agree that if everyone is online while editing or only briefly offline then what you suggest would probably be much better. If someone was offline for long periods of time and made extensive edits they would essentially have to be discarded.

    I think in practice what you would do (if your use case allowed it) is use CRDTs, but periodically checkpoint and trim them when you know everyone has synced. That gives you very similar properties to the video game world and still has the features of not losing peoples edits when they make them offline.

The title sounds like it could be fanboy clickbait but it’s actually a thoughtful look at how far CRDTs have come from the viewpoint of an expert and skeptic.

A good read.

It is wonderful to see so much enthusiasm about this technology. I have been working on CRDTs since 2012 and it has been quite a ride.

For those looking for more information, have a look at the information collected at http://crdt.tech/ (Disclaimer: I am involved, though Martin did the bulk of the work.)

If you are into CRDTs for collaborative gaming, we are looking for partners and investors: https://concordant.io (Disclaimer: I am technical advisor in its team.)

Doesn't Redis implement CRDTs in production?

https://redislabs.com/blog/diving-into-crdts/

  • Yes, as does Riak. There are plenty of simple CRDTs, and the theory, while recent, has all of its fundamentals fleshed out. We know what property makes a data structure a CRDT, how to compose them, and how to prove they are CRDTs.

    Currently we are in the "discovery of new CRDTs" and "engineering and implementing of older CRDTs reliably" phase, and in some cases "discovering when not to use CRDTs".

    The crux of this issue is that CRDTs that play nice with human expectations in regard to collaborative document editing are not known, possibly excepting automerge and yjs. As "plays nice with human expectations" is a 'softer' concept with no good axioms, there is no solid theory on how to combine the theoretical requirements of CRDTs with human expectations.

  • It looks like it's basically biasing in favor of some operations over others. In the link they talk about CRDT sets, saying at some point:

    > 1. Adding wins over deleting.

    yeah, so, _maybe_ you can remove elements from your set. If you're lucky. I dunno about all that...

    • That's an overly pessimistic way to put it.

      I think it's more accurate to say that maybe you can remove elements from your set... unless another actor wants them in the set.

      That's not always the behavior you want. But if it is, it's great.
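
      A tiny sketch of how that "add wins unless..." behavior falls out of an observed-remove set (simplified; my own illustration, not any particular library's API):

          // OR-set sketch: a remove cancels only the add-tags it has actually
          // seen, so a concurrent add (with a fresh tag) survives the merge.
          class ORSet<T> {
            private tags = new Map<T, Set<string>>();

            add(el: T, uniqueTag: string) {
              const s = this.tags.get(el) ?? new Set<string>();
              s.add(uniqueTag);
              this.tags.set(el, s);
            }

            // Returns the tags observed at remove time, to ship to peers.
            remove(el: T): Set<string> {
              const observed = this.tags.get(el) ?? new Set<string>();
              this.tags.delete(el);
              return observed;
            }

            applyRemoteRemove(el: T, observed: Set<string>) {
              const s = this.tags.get(el);
              if (!s) return;
              for (const t of observed) s.delete(t);
              if (s.size === 0) this.tags.delete(el); // no surviving adds left
            }

            has(el: T): boolean {
              return (this.tags.get(el)?.size ?? 0) > 0;
            }
          }

      If one peer removes "x" while another concurrently re-adds it under a fresh tag, the remove cancels only the old tags and "x" stays in the set.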

CRDTs are hip and cool. But right now I'm trying to find an implementation for desktop software, not some web-framework in-electron. And could not find a concise and correct codebase.

All the implementations are: 1. JavaScript, or 2. dependent on their chosen method of synchronisation, or 3. incorrect.

The result of a two week long search is that I'm reimplementing the stuff myself...

  • https://github.com/automerge/automerge-rs

    I can't speak to its usability as I'm waiting on a 1.0

    • I'm one of the authors of this. Right now the code is very unstable as we're tracking the performance branch of the JS implementation. Once the JS version hits 1.0, I'll be putting a bunch of effort into making the API cleaner and more Rust-idiomatic, and into documenting things.

      It does work and can actually be used as a backend for the JS implementation if you use the wasm backend we've built. In fact, this is how we have tested it, by compiling to WASM and running the JS test script against it.

I'm working on a project with some offline data synchronization needs, but haven't started implementation yet. I've been following CRDTs with interest. I also saw many of the same downsides mentioned in the OP, e.g. bloat (which apparently is being addressed remarkably well). Beyond OT, another approach I've run across that looks very promising is Differential Synchronization[1] by Neil Fraser. While it also relies on a centralized server, it allows for servers to be chained in a way that seems to address many of the downsides of OT. I wonder why I rarely ever see Differential Synchronization mentioned here on HN. Is it due to lack of awareness, use-case fit issues, some fatal flaw I haven't seen, or something else?

[1] https://www.youtube.com/watch?v=S2Hp_1jqpY8

Has anybody seen any work where CRDTs get insight into conflict resolution using the underlying grammar of whatever text is being written (e.g. English, JavaScript, regex)? Seems like the conflict resolution could do a better job if it knew the EBNF of the text that was being edited.

Also, any prior art on CRDTs over relational data? I suppose each single field would potentially be an "editable space", but on a long document (say a Javascript code block), updating the whole thing in the db by overwriting it with each edit would not be very efficient. Seems like there could be a datatype that was a little smarter than "text", that could handle more granular updates, that implemented a CRDT index under the hood? I'm working on a VCS for relational data [1] which is more on the not-so-realtime-collaborative-editing end of the spectrum, but would really like to figure out how to handle real-time collaborative editing in this space.

Maybe over WebRTC? I found a little bit of prior art w.r.t. CRDT over WebRTC. [2]

[1]: https://github.com/aquametalabs/aquameta/tree/master/src/pg-... [2]: https://github.com/mattkrick/rich

CRDT = Conflict-Free Replicated Data Types. Think git for data structures instead of directory trees.

Ahhhh Google Wave. I was an early adopter and shed a tear when it went away. The closest I've felt to that product is Slack but find Slack too noisy. With Wave I felt like I was IN my work not in a "sidebar" application that was pulling my attention from my work. I suppose there were so many ways to use Wave and so many ways to use Slack that your experience could be completely different than mine. But RIP Google Wave.

CRDTs seem very promising, but we still have a long way to go. The most exciting work in this area is being done by Ink&Switch [0]. They have a number of interesting real-world app prototypes based on CRDTs.

- An interesting case where CRDTs failed is Xi-editor, where they tried to use CRDTs as the basis for a plugin system [1,2].

- One of the biggest problems with CRDTs is the overhead needed to keep track of the full document history. The automerge [3] project has been working on efficient compression of CRDTs for JSON datatypes.

- The idea of monotonic updates is really appealing at first, but I was disappointed when I realized there's no good solution for handling deletions. Tombstones, to me, seem like kind of a hack, albeit a necessary one (see the sketch at the end of this comment). Practically, CRDTs aren't the silver bullet they might seem like at first.

- Another lesson learned is that when ten people are editing the same paragraph, there's not really a right answer. I think the key to implementing CRDTs is doing it at the correct level of granularity.

- ProseMirror intentionally chose NOT to use CRDTs [4].

- Some more good references are [5,6,7]

[0] https://abishov.com/xi-editor/docs/crdt-details.html
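
To make the tombstone point above concrete, here is a minimal two-phase-set sketch (my own illustration, not from any of the cited projects) of why deletions leave residue:

    // 2P-set: deletions are tombstones, because "just delete it from state"
    // isn't monotonic -- a replica merging later couldn't distinguish
    // "never added" from "added, then removed".
    class TwoPhaseSet<T> {
      added = new Set<T>();
      removed = new Set<T>(); // tombstones; grow forever without compaction

      add(el: T) { this.added.add(el); }
      remove(el: T) { if (this.added.has(el)) this.removed.add(el); }

      has(el: T): boolean {
        return this.added.has(el) && !this.removed.has(el);
      }

      merge(other: TwoPhaseSet<T>) {
        // Set union: commutative, associative, idempotent -- hence a CRDT.
        for (const el of other.added) this.added.add(el);
        for (const el of other.removed) this.removed.add(el);
      }
    }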

I recently blogged about using CRDTs in building privacy focused software, e.g. how would one approach building an end-to-end encrypted Google Docs: https://www.kn8.lt/blog/building-privacy-focused-collaborati...

The nice thing about CRDTs is that each individual message can be end-to-end encrypted (like WhatsApp messages), and then re-merged by all the clients locally.

A local-first database with such an encrypted sync property would be amazing for building lots of apps with the ability to sync data between users or between your devices seamlessly. The challenge I ran into in my initial experiments is that CRDTs need to be compacted/merged in various ways to stay efficient, but encryption gets in the way of that a little when considering server backups / high availability.
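
As a sketch of the shape this takes (the envelope uses the real Web Crypto API; the applyUpdate callback stands in for whatever CRDT library is underneath, e.g. Y.applyUpdate in yjs):

    // Each CRDT update travels as an opaque encrypted blob; the server
    // relays and stores blobs but can never read or merge them.
    async function encryptUpdate(key: CryptoKey, update: Uint8Array) {
      const iv = crypto.getRandomValues(new Uint8Array(12));
      const ciphertext = new Uint8Array(
        await crypto.subtle.encrypt({ name: "AES-GCM", iv }, key, update));
      return { iv, ciphertext };
    }

    async function decryptAndMerge(
      key: CryptoKey,
      blobs: { iv: Uint8Array; ciphertext: Uint8Array }[],
      applyUpdate: (update: Uint8Array) => void, // merge happens client-side
    ) {
      for (const { iv, ciphertext } of blobs) {
        const plain = await crypto.subtle.decrypt(
          { name: "AES-GCM", iv }, key, ciphertext);
        applyUpdate(new Uint8Array(plain));
      }
    }

This is also exactly where the compaction tension shows up: the server can't squash blobs it can't read, so clients have to periodically decrypt, compact and re-encrypt.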

CRDTs just are.

Following the golden rule, I always post a link to a series of papers comparing the theoretical properties of CRDTs and OT – here's the latest one:

Real Differences between OT and CRDT under a General Transformation Framework for Consistency Maintenance in Co-Editors

Proceedings of the ACM on Human-Computer Interaction 2020

Chengzheng Sun, David Sun, Agustina Ng, Weiwei Cai, Bryden Cho

It’s an evolutionary series, here’s the rest I believe: https://arxiv.org/search/cs?query=Sun%2C+Chengzheng&searchty...

I wonder why OT is restricted to a central server. In 2016/2017 I wrote a Progressive Web App (PWA) for myself which uses an algorithm which probably fits the category of OT. It uses a WebDAV server for synchronization between devices. Yes, this is a centralized server, but when some super slow & dumb WebDAV server can serve this purpose, it should probably be possible to build it on top of S3, a blockchain or something federated.

My biggest issues at the time were around CORS: with a PWA you can't simply use every server the user enters, as the same-origin policy keeps getting in your way.

  • OT isn't restricted to one server.

    Any time you see mention of "a server", you can in fact replace it with "a synchronised collection of servers acting as one".

    OT involves logic on the server, not just storage, so a setup using S3 plus a collection of peer clients running only client logic isn't really OT as generally meant. S3 doesn't have enough interesting logic for that.

    However, if you try the thought experiment of stretching "a synchronised collection of servers" to be all of the peers, no S3 even required, and then do OT with that, you can!

    The result behaves exactly like OT in terms of things like document editing, conflict resolution and history garbage collection, rather than behaving like a CRDT.

    It has different timing and reliability characteristics from single-server OT though. Those characteristics depend on how the virtual server is implemented on top of the synchronised peers, and especially how it implements peers coming and going.

    If that sounds like OT-on-p2p has similarities to CRDT-on-p2p - they do, and they are not the same. Roughly speaking, CRDT-on-p2p has lower latency relaying updates between interested peers, because it doesn't need to globally synchronise. However with some fancy techniques you can make OT-on-p2p fast most of the time as well, and retain some of the OT benefits.

    Those two behave differently but there are some common characteristics. Once you have the cluster idea, it's not out of the question to mix and match bits of OT, CRDT and other kinds of Distributed Transaction on a per-operation basis for different ops on the same data, depending on the characteristics you want them to have.

    There are many trade-offs in the characteristics.

    If you squint a lot, that's sort of, kind of, what realtime network games do implicitly, without a clear theory underlying it. They also add predictions and interpolations.

    • > OT involves logic on the server

      Why? What my app is doing is quite simple:

      1. Every time the user changes something it writes the change to a journal and

      2. executes the change on a local cache (to update the UI).

      3. Then it starts a background sync by fetching the latest version from the server

      4. executes the change on the fresh data from the server

      5. uploads the transformed data with an etag to avoid overwriting parallel changes from other clients and

      6. removes the change from the journal (and updates the local cache) if everything worked just fine.

      So you could argue that using the etag is some kind of logic, but I think that is not what you mean by 'involves logic on the server'.

      This implementation certainly doesn't work for all use-cases (e.g. high throughput/low latency), but given that it enables even offline-scenarios, I think it isn't that bad either.
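
      For concreteness, the whole loop as a sketch (the endpoint and names are made up; ETag/If-Match are the real HTTP mechanism):

          // Optimistic-concurrency sync against a dumb storage server.
          type Doc = { items: string[] };
          type Change = (doc: Doc) => Doc;

          const journal: Change[] = []; // step 1: persisted locally in the real app

          async function syncOne(url: string): Promise<void> {
            const change = journal[0];
            if (!change) return;

            // Steps 3-4: fetch the latest server version, replay the change on it.
            const res = await fetch(url);
            const etag = res.headers.get("ETag")!;
            const fresh: Doc = await res.json();
            const updated = change(fresh);

            // Step 5: conditional PUT; the server rejects it with 412 if
            // another client wrote in the meantime.
            const put = await fetch(url, {
              method: "PUT",
              headers: { "If-Match": etag, "Content-Type": "application/json" },
              body: JSON.stringify(updated),
            });

            if (put.status === 412) return syncOne(url); // raced: refetch, retry
            if (put.ok) journal.shift(); // step 6: change is durable, drop it
          }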

A problem with e.g. CRDT data sync in web apps is data security. HTTP resources impose control points where you know "why" the client is asking for, say, this chunk of the social graph: it's /profile/friendlist, so the UI can ask for a very controlled and tightly specified data projection for that particular UI, consumed by tightly controlled JavaScript. Data sync is NOT for scraper bots, arbitrary read patterns or any notion of general access.

Immutability makes data control way harder ...

When sick, listen to the doctor. Don't listen to the carpenter.

IMO, when looking for predictions of the future, listen to the ones that have a track record of accurate predictions.

  • That leaves you vulnerable to selection bias, of course... the best example (IMHO) being the Wall Street "binary search" stock prediction scam.

>I’m no longer convinced OT - and all the work I’ve done on it - will still be around. I feel really sad about that.

Don't be sad. What you did was a huge inspiration to me and I think a lot of people. It showed the possibilities even if it wasn't "optimal".

Commenting here in hope you'll see it. I skim-read most things; I only completed reading this article because I noticed 'I worked on Google Wave'.

Operational Transformation and Conflict-Free Replicated Datatypes are very different from each other.

As the author explains, OT relies on some ordering of system events, and CRDTs don't. That means CRDTs need to be commutative (and probably associative), and OT doesn't.

So, OT is less scalable but more powerful, and CRDTs are more scalable but less powerful (in theory).

It's sort of like comparing Paxos/Raft to Bittorrent.

(I am not an expert on OT.)
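
To make the commutativity requirement concrete, the classic toy example is a grow-only counter (standard in the CRDT literature; this particular code is just my sketch):

    // G-counter: each replica increments only its own slot; merge is an
    // elementwise max, so merge(a, b) == merge(b, a) and arrival order
    // never matters.
    type GCounter = Map<string, number>; // replica id -> count

    function increment(c: GCounter, replica: string): GCounter {
      const next = new Map(c);
      next.set(replica, (next.get(replica) ?? 0) + 1);
      return next;
    }

    function merge(a: GCounter, b: GCounter): GCounter {
      const out = new Map(a);
      for (const [id, n] of b) out.set(id, Math.max(out.get(id) ?? 0, n));
      return out;
    }

    function value(c: GCounter): number {
      let total = 0;
      for (const n of c.values()) total += n;
      return total;
    }

An OT system doesn't need this property because the server imposes a total order and transforms each incoming op against what it missed.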

Great summary. CRDTs are a better fit for generalized data. Having previously worked on an OT system, I found the central-server stickiness and merge complexity simply did not scale. There are trade-offs with CRDTs, especially metadata overhead, but as the post mentions, compression techniques make that far more solvable in real-world scenarios than a fundamental performance bottleneck at the core.

Collaboration is the killer app of the next gen of software.

What we need is a community set of peer relay nodes, on top of which data structures can be synced. Infra companies are well set up to provide this Firebase-like storage layer on top, but for any generic data structures (lists, dicts, arrays, etc).

With this, any saas application is at a disadvantage, because data is no longer tied to the application!

So CRDTs are the future, but what about today for real, production products? I'm just about to really dive into collaborative editing features for our product, and OT still seems to me to be a much safer bet unless you're dealing with a more obscure environment.

  • Yes, starting with OT looks easy. You can make 99% work in almost no time. But the last 1% will bite you in the rear really hard...

    Actually, CRDT is not a single data structure or even algorithm. It is a term for several families of data structures and different algorithms on them. If your task is not editing text, you may find a simple and already implemented CRDT for your case.

[CRDTs] would let us write software that treats users as digital citizens, not as digital serfs

Amen, brother.

"It was a general purpose medium (like paper). Unlike a lot of other tools, it doesn’t force you into its own workflow. You could use it to do anything from plan holidays, make a wiki, play D&D with your friends, schedule a meeting, etc."

So, sort of like email ?

I'm looking for a solution to implement collaborative editing in my visual programming node editor. Are CRDTs useful in this case?

Please include the expanded version of your acronym in post titles. Not everyone knows what you're talking about.

Google Wave is such a bitter sweet memory for me still. Such a great vision, such a great team. Such a sad company.

CRDT stands for conflict-free replicated data type.

  • Thanks, I had to look it up as well. It's not the first article I read on CRDTs but I definitely didn't recall what they were from just the acronym.

For all those wondering. CRDTs appear to be unrelated to some revival in Cathode Ray Tubes.

It takes over 800 words, and several mentions of the CRDT acronym, before the acronym is expanded for the reader.

Don't write like this. Respect your readers and help them comprehend. Expand acronyms as early as you can, ideally at the first mention.

  • The trouble with comments like this is that they make discussions shallower and more generic, which makes for worse threads [1]. Actually it's not so much the comment as the upvotes, but shallow-generic-indignant comments routinely attract upvotes, so alas it amounts to the same thing.

    The most recent guideline we added says: "Please don't complain about website formatting, back-button breakage, and similar annoyances. They're too common to be interesting." I suppose that complaints about writing style fall under the same umbrella.

    Not that these things don't matter (when helping people with their pieces for HN I always tell them to define jargon at point of introduction), but they matter much less than the overall specific topic and much less than the attention they end up getting. So they're basically like weeds that grow and choke out the flowers.

    (This is not a personal criticism—of course you didn't mean to have this effect.)

    https://news.ycombinator.com/newsguidelines.html

    [1] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...

  • I don't understand why you think that writing on a very technical subject needs to build you a ladder to climb on as a prerequisite. There is a link to a very high quality talk right at the top of the article for folks who wanted to dive deeper that specifically makes that effort.

    I found the article quite good, and if you had genuinely been motivated to engage with the content you could have highlighted the acronym and searched for it. There is a wealth of good info for "CRDTs" that comes up on the first page of Google, Bing or DDG.

    Does the acronym actually illuminate what they are or how they function? I submit to you that it probably doesn't.

    • There are practices for informative writing that have been developed over decades and decades that recommend, among other things, defining initialisms on first use.

      1 reply →

  • As I tirelessly mention whenever this comes up on HN, which is often: we have a specific technology that is designed precisely for this situation.

    It's called the link. All an author has to do is link the first instance of an acronym or piece of jargon to some authoritative description, and you get the best of both worlds: readers familiar with e.g. CRDTs[0] can just keep reading, and the rest can click the link and find out.

    [0]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...

    • In defense, the first sentence links to a Youtube video that expands the acronym in the first 10 seconds.

      The video also gives good context for the article, even for a beginner to the topic.

    • We also have select-contextmenu-search on both desktop and mobile, for any word or acronym. Links are nice for disambiguation or to point to a recommended resource, but they're hardly essential, nor are in-line expansions or definitions.

    • That's still lazy writing. Every blog should be written with the assumption it will be encountered by a non-specialist. Expanding abbreviations on first use and offering a brief explanation of jargon is enough to let these readers know if the article is something they are interested in.

      8 replies →

  • While I agree that reading the title was confusing (as I am not familiar with CRDT), I think the writing style was actually very good.

    I read the title, wondered what CRDT was, and started reading. In the back of my mind I was wondering what CRDT was, but reading the article felt like I was going on a journey. Every term that needed to be defined was defined. Finally, when CRDT was mentioned in the article, it was immediately defined.

    I generally agree that throwing acronyms around without defining them is not fair to the reader, but I don't think this article did that at all.

    • Yup, strong agree. The article did a great job of capturing the "story" of the competing approaches really well, I didn't even mind that the acronym wasn't explained until later.

    • This is called "burying the lede", where the newsworthy portion is buried somewhere later instead of being mentioned upfront. It's best not to do this, since not all readers will read two thirds of a story in order to determine the subject.

      4 replies →

  • Seriously. It needs to be explained the FIRST TIME it appears, and it shouldn't be abbreviated in the title. I read for a minute thinking he was talking about Chrome Remote Desktop (which is what CRDT means to me).

    Mods, can we expand the acronym in the title of this submission please?

  • > Don't write like this. Respect your readers

    This is way over the top.

    I thought the author did an amazing job of discussing a highly technical topic in a very approachable way. Every blog on HN should aspire to write like this! It was so good it got me reading other posts even.

    Yes, it would have been nice for us non- domain experts if the author had done the classic "Conflict-free replicated data type (CRDT)" thing, but you can easily just say that, ya know? "Hey, it would be helpful if you expanded CRDT early on."

  • Agreed. 29 mentions of the acronym 'CRDT', and I had no idea what it was until I had to break my reading flow and google it; it sounded like buzzword soup to me.

    Engineers, when talking about technical concepts with acronyms, always expand them on first use for your readers!

  • I’m sure a casual, non-technical reader of Hacker News would be unaware of most of the headlines here. Google is your friend, and CRDTs are part of the language of distributed systems. To some degree, one has to help themselves.

  • Given that the author defines CRDT (conflict-free replicated data type) a few paragraphs in, it might have been accidental. The author might have re-ordered a few of the paragraphs during editing.

  • You can also be even nicer and have the first expansion of the acronym link to a wikipedia page or other relevant explanation.

  • I strongly disagree; that forces the author to spend extra time on explaining everything. That's why it's often so hard for me to find quality in-depth advanced blogs on various technologies and fields -- because they all tend to be really introductory. So there's either papers or tutorials, but nothing in between. E.g. a different-angle explanation of the same thing, or a comparison with another tech for people who came from it.

    In contrast, I like way more a different approach on explaining (mostly see it on Cyrillic forums) -- instead of guiding you by hand, they just give you clues where to look for. That way, knowledge givers are way more approachable, because it costs them very little to chat back something like "look for CRDT", than go into in-depth explaining. In the end -- there's way more information, and from top experts in the fields.

  • Author here. Thanks for the feedback - I’ll update the article.

    I’m a little embarrassed to admit I didn’t even notice.

    • I'm of the camp that focusing on the acronym stuff is missing the point: that was a thoughtful, well-written piece.

      I, for one, am grateful that you took the time to write it.

      2 replies →

  • He wasn't writing for you, he was obviously writing for people familiar with these algorithms.

  • Is it really that hard to google? If you're trying to learn about a subject it can get annoying to repeatedly have to jump to the meat of the article or fast forward if you're watching a video.

    • Obviously googling isn't hard, but having to google what could be easily explained in the text breaks one's concentration, something that is critical for most readers.

      2 replies →

    • Is it really that hard to spell it out and then put the abbreviation in parenthesis the first time it is used?

  • I was at least happy that the wiki detour introduced me to "gossip protocols", which is probably now one of my all-time favourite technology names.

  • I literally closed the article after reading the first blurb, because it wasn't explained. Just started googling.