Comment by bobmcnamara

1 day ago

Most Microsoft Office documents.

One of our projects has a UI editor with a 60MB file for nearly everything except images, and people work on different UI flows at the same time.

So for Office, you're looking at files that are already archive formats. Maybe you could improve on that a bit, but because of the compression you wouldn't be able to diff text edits any better; you'd just save storage. So it would perform about the same as git already does. You could make it smarter so the prolly tree works better, but you could make git smarter in the same way; that's not a prolly-tree-specific optimization.
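To make the compression point concrete, here is a minimal sketch using Python's `zlib` (the same DEFLATE family that ZIP-based formats normally use). The text is a made-up stand-in for a document body, not a real Office payload, and the exact numbers will vary with input and zlib version. It makes a three-byte edit to the raw text and then measures how much of the compressed stream changes.

```python
import zlib

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that two byte strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Made-up stand-in for a document body (assumption, not a real Office payload).
rows = [f"Row {i}: the cat sat on mat number {i * 7919 % 1000}.\n" for i in range(2000)]
old_text = "".join(rows).encode()
rows[1000] = rows[1000].replace("cat", "dog")  # one same-length word edit in the middle
new_text = "".join(rows).encode()

old_z = zlib.compress(old_text)
new_z = zlib.compress(new_text)

# Count byte positions that differ; any length difference counts as changed bytes too.
raw_changed = sum(x != y for x, y in zip(old_text, new_text))
z_changed = sum(x != y for x, y in zip(old_z, new_z)) + abs(len(old_z) - len(new_z))

print(f"raw:        {len(old_text)} bytes, {raw_changed} changed")
print(f"compressed: {len(old_z)} bytes, {z_changed} changed")
print(f"compressed streams share a {common_prefix_len(old_z, new_z)}-byte prefix")
```

A byte-level or chunk-level differ sees a tiny change in the raw text, but it has much less to work with once the same edit has passed through the compressor.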

For your UI editor I'd need to understand the format more.

  • I picture a next-gen SCM having storage plugins, possibly implemented in something portable like WASM or another bytecode. This would enable archive formats to be unpacked for differencing and then repacked for consumption.

    Alternatively, apps could detect and support SCM storage distinct from ordinary file storage.

    As a trivial example: modern Office documents (.docx, .xlsx, .pptx) are just ZIP files, so compatible apps could save them without compression when writing to SCM-managed folders.

    Or better yet, keep the compression but restart it for each contained file to enable efficient binary diffs.

    • ZIP does restart compression for each file. But that still wastes a lot of storage when you change, for example, a single word.
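A rough sketch of both ideas from this subthread, using only Python's standard `zipfile` module. The helper names (`build_archive`, `unpack`, `changed_members`, `differing_bytes`) and the member contents are hypothetical, made up for illustration; no real SCM plugin API is implied. It shows (1) unpacking an archive so members can be compared individually, and (2) that a stored (uncompressed) archive keeps a one-word edit local.

```python
import io
import zipfile

def build_archive(members: dict, compression: int) -> bytes:
    """Write members (name -> bytes) into an in-memory ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            info = zipfile.ZipInfo(name)  # default timestamp, so output is reproducible
            info.compress_type = compression
            zf.writestr(info, data)
    return buf.getvalue()

def unpack(archive: bytes) -> dict:
    """The 'unpack for differencing' step: name -> uncompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

def changed_members(old: bytes, new: bytes) -> list:
    """Names of members that were added, removed, or modified."""
    a, b = unpack(old), unpack(new)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))

def differing_bytes(a: bytes, b: bytes) -> int:
    """Positions that differ, plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

# Made-up contents; the member names only mimic a .docx layout.
paras = [f"<p>Paragraph {i}: the cat sat on the mat.</p>" for i in range(500)]
styles = b"<styles>" + b"<style/>" * 200 + b"</styles>"
old = {"word/document.xml": "".join(paras).encode(), "word/styles.xml": styles}
paras[250] = paras[250].replace("cat", "dog")  # change one word
new = {"word/document.xml": "".join(paras).encode(), "word/styles.xml": styles}

stored_old = build_archive(old, zipfile.ZIP_STORED)
stored_new = build_archive(new, zipfile.ZIP_STORED)
deflated_old = build_archive(old, zipfile.ZIP_DEFLATED)
deflated_new = build_archive(new, zipfile.ZIP_DEFLATED)

print("members changed:", changed_members(deflated_old, deflated_new))
print("stored archive bytes changed:  ", differing_bytes(stored_old, stored_new))
print("deflated archive bytes changed:", differing_bytes(deflated_old, deflated_new))
```

In the stored case only the edited bytes and the CRC-32 fields change, so ordinary delta compression in the SCM can do its job. In the deflated case the untouched member still compresses to identical bytes (ZIP compresses each member on its own), but the edited member's compressed data can look quite different, which is the wasted storage described above.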