Comment by PaulDavisThe1st
5 years ago
Line-oriented data formats vs everything else. Why? Because of "patching theory". If you don't understand that the data describes objects and doesn't have line-by-line semantics, it is hard to get merges correct.
Version control works wonders with line-oriented stuff, which covers more or less every programming language in existence.
It doesn't do so well with non-line-oriented structured formats such as XML (not sure how JSON or TOML fit in here).
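To make the line-oriented vs object-oriented distinction concrete, here is a minimal Python sketch using JSON as the structured format (my choice for illustration, since the comment above is unsure where JSON fits). The two strings are made-up sample data: they serialize the same object with different layout and key order, so a line diff reports changes while a comparison of the parsed data reports none.

```python
import difflib
import json

# Two serializations of the same object (hypothetical sample data).
# Only the layout and key order differ.
a = '{"name": "demo", "tags": ["x", "y"]}\n'
b = '{\n  "tags": ["x", "y"],\n  "name": "demo"\n}\n'

# Line-oriented view: compare the text line by line.
line_diff = list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm=""))

# Object-oriented view: compare what the text describes.
same_data = json.loads(a) == json.loads(b)

print(len(line_diff) > 0)  # the line-based tool sees a change
print(same_data)           # the parsed objects are equal
```

A merge tool that only sees the first view has to treat this as a real edit, which is roughly where the trouble with merges starts.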
Given that collaborative editing typically works with non-line-oriented data formats, you can see the issue, I think.
That's what I refer to as "grammar-aware diffing" in the sibling comment, and it's one of the low-hanging fruits here.
Even git allows for pluggable diffing, and doesn't force line orientation. What's missing is the concept of moving something, as distinct from deleting lines/chunks and then inserting lines/chunks which just happen to be the same.
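As a rough illustration of the "move vs delete-then-insert" point, here is a small Python sketch. It is not how git or pijul work internally; `find_moves` is a hypothetical helper that post-processes an ordinary line diff and pairs up deleted blocks with identical inserted blocks, which is about the simplest way to recover a move after the fact.

```python
import difflib

def find_moves(old, new):
    """Return blocks deleted in one place and inserted verbatim in another.

    Hypothetical helper: a naive post-processing pass over a plain line diff.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    deleted, inserted = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append(tuple(old[i1:i2]))
        if tag in ("insert", "replace"):
            inserted.append(tuple(new[j1:j2]))
    # A block that shows up on both sides was moved, not rewritten.
    return [block for block in deleted if block in inserted]

old = ["def a():", "    return 1", "def b():", "    return 2", "def c():", "    return 3"]
new = ["def b():", "    return 2", "def c():", "    return 3", "def a():", "    return 1"]

moves = find_moves(old, new)
print(moves)
```

The plain diff reports one deletion and one insertion; only the extra pass recognizes them as the same block. A system that records the move as a first-class operation would not need to guess.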
This is not a problem which CRDTs have, to put it mildly. I believe pijul understands it as well. A lot of this stuff is right out on the cutting edge, and as it matures it will become practical to connect the edges, such as a CRDT which collaborates with a parser to produce grammar-aware patches which are automagically fed to pijul or something like it.
This comes with a host of problems, mostly that we're not used to dealing with a history which has this level of granularity, most of which we don't want to see, most of the time. But they would be nice problems to have.
Some of "We" depend on sub-line diff highlighting during code reviews in order to reason about refactors and adding/removing arguments from function signatures.
That this is generally a feature of the diff tool and not the version control is a bit disappointing.
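For anyone curious what sub-line highlighting amounts to, here is a minimal Python sketch, assuming a simple regex tokenizer and a `[-removed-]` / `{+added+}` marker style (both are my choices for illustration, not what any particular review tool does). `intraline_diff` is a hypothetical helper that diffs tokens within a single changed line instead of diffing whole lines.

```python
import difflib
import re

def intraline_diff(old_line, new_line):
    """Mark token-level changes between two versions of one line.

    Hypothetical helper; the tokenizer splits into words, whitespace runs,
    and single punctuation characters.
    """
    def tokenize(text):
        return re.findall(r"\w+|\s+|[^\w\s]", text)

    a, b = tokenize(old_line), tokenize(new_line)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append("".join(a[i1:i2]))
            continue
        if i1 < i2:  # tokens removed from the old line
            out.append("[-" + "".join(a[i1:i2]) + "-]")
        if j1 < j2:  # tokens added in the new line
            out.append("{+" + "".join(b[j1:j2]) + "+}")
    return "".join(out)

marked = intraline_diff(
    "def save(path, data):",
    "def save(path, data, overwrite=False):",
)
print(marked)
```

A line-level diff would show the whole signature as removed and re-added; the token-level pass narrows it to the added argument, which is what makes signature changes reviewable at a glance.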