Comment by forrestthewoods

18 days ago

> On the fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive. Both in terms of materializing two files at specific commits, and then diffing these two files.

I don’t think that’s actually true?

How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?

I’ve worked with some tools that can diff images. Works great. Not a problem in need of solving.

In any case, I’ll give the benefit of the doubt that this project solves some real problem in a useful way. I’m just not sure what it is.

My goals in a VCS for binary files seem to be very very very different than yours.

I think our goals indeed differ.

> How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?

If version control is embedded in an app, constantly.

Imagine a cell in a spreadsheet. An application wants to display a "blame" for cell C43, i.e. how did the cell change over time?

The lix way is this SQL query:

SELECT * FROM state_history WHERE file_id = '<the_spreadsheet>' AND schema_key = 'excel_cell' AND entity_id = 'C43';

Diffing on the fly is not possible here. The information about what changed needs to be available without diffing. Otherwise, answering how cell C43 changed means materializing and diffing the entire spreadsheet file for every commit, which takes ages.
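
To make that concrete, here is a rough sketch in Python with the standard sqlite3 module. Only the table name and the file_id, schema_key and entity_id columns come from the query above. The commit_id and value columns, the index, the file name budget.xlsx and the sample rows are assumptions made up for illustration, not lix's actual schema.

```python
import sqlite3

# Hypothetical, simplified change table: one row per entity change per commit.
# This is NOT lix's real schema; commit_id and value are illustrative columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE state_history (
        commit_id  INTEGER,
        file_id    TEXT,
        schema_key TEXT,
        entity_id  TEXT,
        value      TEXT
    )
    """
)
# An index on the lookup columns lets the database find one cell's rows
# without scanning the changes of every other cell.
conn.execute(
    "CREATE INDEX idx_entity ON state_history (file_id, schema_key, entity_id)"
)

# Made-up sample data: three commits touching two cells.
rows = [
    (1, "budget.xlsx", "excel_cell", "C43", "100"),
    (1, "budget.xlsx", "excel_cell", "A1", "Total"),
    (2, "budget.xlsx", "excel_cell", "A1", "Grand total"),
    (3, "budget.xlsx", "excel_cell", "C43", "250"),
]
conn.executemany("INSERT INTO state_history VALUES (?, ?, ?, ?, ?)", rows)


def cell_history(file_id, entity_id):
    """Return (commit_id, value) pairs for one cell, oldest first."""
    cur = conn.execute(
        """
        SELECT commit_id, value FROM state_history
        WHERE file_id = ? AND schema_key = 'excel_cell' AND entity_id = ?
        ORDER BY commit_id
        """,
        (file_id, entity_id),
    )
    return cur.fetchall()


history = cell_history("budget.xlsx", "C43")
print(history)
```

The lookup only reads the rows recorded for C43. Commit 2, which changed a different cell, is never touched, and no spreadsheet file is materialized or diffed along the way.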