Comment by zemo
5 years ago
In real time? Well, I have thoughts, but I'm not super familiar with Ardour itself, so I'm not sure if you're trying to merge during a live performance or if you're talking about more of a distributed studio recording session. I have working knowledge of Reason, Logic, and ChucK (which I use with JACK and do some networked OSC stuff with, although I haven't touched it in a few years).
The approach we use at Jackbox for making the state of an Xbox game mutable to thousands of live viewers on Twitch is to have lots of little CRDT values, mostly just counters and sets of strings, and to merge the little values independently of one another. That's very different from editing a text document, which is typically structured as one big value.
I wonder if, for a DAW, you could merge at the track or control level instead of the workspace level. E.g., treat the state of an individual fader as an independent value, communicate either states or operations on that fader, and have each client merge them. In this example, the fader's state would be encoded as a PN-counter with a range condition, and you'd replicate increment and decrement operations, like it was a networked rotary encoder. So every mutable thing in the DAW would be a value with operations that can be merged, instead of having a single big value representing the entire state of the DAW.
My use case is also funky because I have potentially thousands of writers writing to the same key but only a single reader, and the reader doesn't need an update at every change, so I use state-based CRDTs, but I think most other people using CRDTs use operation-based CRDTs. Also not sure how you would mutate two separate values transactionally, or if that's a thing you even need.
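To make the fader example a bit more concrete, here is a rough sketch of a state-based PN-counter in Python. The class name `FaderCounter`, the 0 to 127 range, and the replica names are all made up for illustration, and "range condition" is interpreted here as clamping the value when it is read, which is only one possible reading. It is not how Jackbox or Ardour actually does anything.

```python
from dataclasses import dataclass, field


@dataclass
class FaderCounter:
    """State-based PN-counter whose read value is clamped to [lo, hi].

    Hypothetical illustration: the name, the range, and the clamp-on-read
    behaviour are assumptions, not taken from any real DAW.
    """
    lo: int = 0
    hi: int = 127
    incs: dict = field(default_factory=dict)  # replica id -> total increments
    decs: dict = field(default_factory=dict)  # replica id -> total decrements

    def increment(self, replica, n=1):
        # Each replica only ever grows its own entry.
        self.incs[replica] = self.incs.get(replica, 0) + n

    def decrement(self, replica, n=1):
        self.decs[replica] = self.decs.get(replica, 0) + n

    def value(self):
        raw = sum(self.incs.values()) - sum(self.decs.values())
        # Clamp on read so the merged state itself stays a plain PN-counter.
        return max(self.lo, min(self.hi, raw))

    def merge(self, other):
        # Per-replica maximum: commutative, associative, and idempotent.
        merged = FaderCounter(self.lo, self.hi)
        for r in self.incs.keys() | other.incs.keys():
            merged.incs[r] = max(self.incs.get(r, 0), other.incs.get(r, 0))
        for r in self.decs.keys() | other.decs.keys():
            merged.decs[r] = max(self.decs.get(r, 0), other.decs.get(r, 0))
        return merged


# Two clients nudge the same fader independently, then exchange state.
alice, bob = FaderCounter(), FaderCounter()
alice.increment("alice", 10)
bob.increment("bob", 5)
bob.decrement("bob", 2)
print(alice.merge(bob).value(), bob.merge(alice).value())
```

Both merge orders give the same value, which is the point: neither client has to know who synced first. The catch with clamping on read is that the raw count can drift outside the range, so a fader pushed well past the top needs that many decrements before it visibly moves again.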
Not realtime. Users would sync periodically during their working process.
There are lots of mutable things in a DAW that are not numeric parameters.
The state of a playlist (just think "some time ordered list of objects") is not treatable in the same way as the value of a fader.
If you had a context-aware XML parser and access to timestamps for every XML node, you could do the human-aided merge by considering each node and just using the latest version of the same node, falling back to the human when there's a deletion conflict (for example). But this doesn't actually merge the attributes of a node or deal with conflicts between otherwise sequential edits.
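As a rough sketch of that latest-version-wins idea, assuming each side of the merge has already been parsed into a mapping from a stable node id to a timestamped version (the names `NodeVersion` and `merge_nodes`, and the use of `None` content to mark a deletion, are invented for this example and are not anything in Ardour's session format):

```python
from typing import NamedTuple, Optional


class NodeVersion(NamedTuple):
    timestamp: float
    content: Optional[str]  # None marks a node that this side deleted


def merge_nodes(mine, theirs):
    """Take the latest version of each node; hand deletion conflicts to a human.

    Returns (merged, conflicts), where conflicts lists the node ids that
    one side deleted while the other side still has content for them.
    """
    merged, conflicts = {}, []
    for node_id in sorted(mine.keys() | theirs.keys()):
        a, b = mine.get(node_id), theirs.get(node_id)
        if a is None or b is None:
            # Node exists on one side only: keep whichever side has it.
            merged[node_id] = a if a is not None else b
        elif (a.content is None) != (b.content is None):
            # One side deleted, the other did not: do not guess.
            conflicts.append(node_id)
        else:
            # Same node on both sides: the newer timestamp wins wholesale.
            merged[node_id] = a if a.timestamp >= b.timestamp else b
    return merged, conflicts


mine = {
    "region-1": NodeVersion(10.0, "gain=0.5"),
    "region-2": NodeVersion(12.0, None),
}
theirs = {
    "region-1": NodeVersion(11.0, "gain=0.7"),
    "region-2": NodeVersion(13.0, "start=100"),
    "region-3": NodeVersion(9.0, "start=0"),
}
merged, conflicts = merge_nodes(mine, theirs)
print(merged)
print(conflicts)
```

The limitation shows up directly in the last branch: the newer node replaces the older one wholesale, so if one user changed a node's gain and the other changed its start position, one of those edits is silently lost.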