Comment by ks2048
1 day ago
> This is a solved problem: Rsync does it.
Can you explain what the solution is? I don't mean the details of the rsync algorithm, but rather what it would like like from the users' perspective. What files are on your local filesystem when you do a "git clone"?
When you do a shallow clone, no files would be present. However when doing a full clone you’ll get a full copy of each version of each blob, and what is being suggested is treat each revision as an rsync operation upon the last. And the more times you muck with a file, which can happen a lot both with assets and if you check in your deps to get exact snapshotting of code, that’s a lot of big file churn.
The overwhelming majority of large assets (images, audio, video) will receive near-zero benefit from using the rsync algorithm. The formats generally have massive byte-level differences even after small “tweaks” to a file.
Video might be strictly out of scope for git, consider that not even youtube allows 'updating' a video.
This will sound absolutely insane, but maybe the source code for the video should be a script? Then the process of building produces a video which is a release artifact?
2 replies →
A lot of video editing includes splicing/deleting some footage, rather than full video rework. rsync, with its rolling hash approach, can work wonders for this use-case.