Comment by flohofwoe
19 hours ago
Even ancient svn works much better out of the box for large binary files than git (e.g. a 150 GB working directory with many big Photoshop files and other binary assets is no problem in SVN).
What does SVN do differently than git when it comes to large binary files, and why can't git use the same approach?
I also don't quite understand tbh how offloading large files to somewhere else would be fundamentally different than storing all files in one place except complicating everything? Storage is storage, how would a different storage location fix any of the current performance and robustness problems? Offloading just sounds like a solution for public git forges which don't want to deal with big files because it's too costly for them, but increased hosting cost is not the 'large binary file problem' of git.
(edit: apparently git supports proper locking(?) so I removed that section - ps: nvm it looks like the file locking feature is only in git-lfs)
Completely different design. Git is intended to be fully distributed, so (a) every repo is supposed to have the full history of every file, and (b) locking is meaningless.
People should use the VCS that's appropriate for their project rather than insist on git everywhere.
> People should use the VCS that's appropriate for their project rather than insist on git everywhere.
A lot of people don't seem to realise this. I work in game dev and SVN or Perforce are far far better than Git for source control in this space.
In AA game dev a checkout (not the complete history, not the source art files) can easily get to 300GB of binary data. This is really pushing Subversion to its limits.
In AAA gamedev you are looking at a full checkout of the latest assets (not the complete history, not the source art files) of at least 1TB and 2TB is becoming more and more common. The whole repo can easily come in at 100 TB. At this scale Perforce is really the only game in town (and they know this and charge through the nose for it).
In the movie industry you can multiply AAA gamedev by ~10.
Git has no hope of working at this scale as much as I'd like it to.
Perforce gets the job done but it's a major reason why build tooling is worse in games.
GitHub/GitLab is miles ahead of anything you can get with Perforce. People are not just pushing for git because of its UX; they're pushing git so they can use the ecosystem.
1 reply →
I've been thinking of using a git filter to split the huge asset files (which are internally just collections of assets bundled into 200M-1GB files) into smaller ones. That way, when an artist modifies one sub-asset in a huge file, only the small change is recorded in history. There is an example filter for doing this with zip files.
The above should work. But does git support multiple filters for a file? For example, first the asset-split filter above, and then storing the files in LFS, which is another filter.
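A rough sketch of how such a clean/smudge filter is wired up. The filter name `assetpack` is made up, and gzip stands in for the real asset-splitting tool (so this only demonstrates the plumbing, not the splitting itself):

```shell
# Sketch of a clean/smudge filter setup. "assetpack" is a hypothetical name,
# and gzip is a stand-in for an actual asset-splitting tool.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email you@example.com
git config user.name you

# clean runs on 'git add' (worktree -> repo), smudge on checkout (repo -> worktree)
git config filter.assetpack.clean  "gzip -nc"
git config filter.assetpack.smudge "gzip -dc"
printf '*.pak filter=assetpack\n' > .gitattributes

printf 'asset-data' > bundle.pak
git add .gitattributes bundle.pak
git commit -qm "add bundle"

# the worktree file is untouched; the repo stores the filtered form
git cat-file -p :bundle.pak | gzip -dc   # round-trips back to the original
```

As for chaining: as far as I can tell, gitattributes only allows a single `filter` driver per path (a later match overrides, rather than stacks), so combining a split filter with LFS would mean writing one composite filter that invokes LFS itself.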
1 reply →
> People should use the VCS that's appropriate for their project rather than insist on git everywhere.
Disagree. I really like the "de-facto standard" that git has become for open source. It means if I want to understand some new project's source code, there is one less hassle for me to deal with: I don't need to learn any new concepts just to access the source code and all the tooling is already right there.
The situation we have with package managers, dependency managers and package managers for package managers is bad enough. I really don't want a world in which every language or every project also comes with its own version control system and remote repo infrastructure.
A "proper" versioning system doesn't need to be learned, since you literally only need a handful of straightforward operations (how do I get the latest version? how do I add a file? how do I commit changes?). In svn that's 'svn update', 'svn add' and 'svn commit', and that's all that's needed to get you through the day: no 'push', no 'staging area', no 'fetch' vs 'pull', and no inevitable merge-vs-rebase discussion... etc etc etc...
It's only git which has this fractal feature set which requires expert knowledge to untangle.
3 replies →
To be clear, "fully distributed" also means "can use all of the features offline, and without needing commit access to the central repository".
I can't imagine living without that feature, but I also do a lot of OSS work so I'm probably biased.
How often are you fully offline though? A centralized version control system could be updated to work in an 'offline mode' by queueing pushed changes locally until a connection is available again (and in SVN that would be quite trivial, because it keeps the last known state of the repository in a local shadow copy anyway).
4 replies →
> People should use the VCS that's appropriate for their project rather than insist on git everywhere.
But git is likely to be appropriate almost everywhere. You wouldn't use svn just for the big files while git is better for everything else in the same project.
The thing is, in a game project, easily 99% of all version controlled data is in large binary files, and text files are by far the minority (at least by size). Yet still people try to use git for version control in game projects just because it is the popular option elsewhere and git is all they know.
4 replies →
> Git is intended to be fully distributed
Which is kinda funny because most people use git through GitHub or GitLab, i.e. forcing git back into a centralized model ;)
> People should use the VCS that's appropriate for their project rather than insist on git everywhere.
Indeed, but I think that train has left long ago :)
I had to look it up after I wrote that paragraph about locking, but it looks like file locking is supported in Git (though it's weird that I need to press a button in the GitLab UI to lock a file):
https://docs.gitlab.com/user/project/file_lock/
...and here it says it's only supported with git lfs (so still weird af):
https://docs.gitlab.com/topics/git/file_management/#file-loc...
...why not simply 'git lock [file]' and 'git push --locks' like it works with tags?
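For what it's worth, git-lfs gets reasonably close to that. Marking a file type as lockable looks something like this (illustrative `.gitattributes` fragment for Photoshop files):

```
# .gitattributes: track PSDs in LFS and mark them lockable
*.psd filter=lfs diff=lfs merge=lfs -text lockable
```

After that, `git lfs lock file.psd`, `git lfs unlock file.psd` and `git lfs locks` work from the command line, no UI button needed; the catch is that it requires a host implementing the LFS File Locking API (GitHub and GitLab both do), since a lock is inherently a centralized concept.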
If you’re making local commits (or a local branch, local merge, etc.), you’re leveraging the distributed nature of Git. With SVN, all of these actions had to go through the server. So if you were offline and thinking of making a risky change you might want to back out of, there was no way in SVN to make an offline commit you could revert to.
Of course if you’re working with others you will want a central Git server you all synchronize local changes with. GitHub is just one of many server options.
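The offline-safety-net workflow described above, as a minimal sketch (throwaway repo, made-up file contents):

```shell
# Offline safety net: commit before a risky change, then back out locally.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email you@example.com
git config user.name you

echo "stable" > main.c
git add main.c && git commit -qm "known-good state"   # no server involved

echo "risky rewrite" > main.c
git commit -qam "experiment"                          # still fully offline

git reset --hard -q HEAD~1   # experiment failed: back to the known-good commit
cat main.c                   # prints: stable
```

None of this touches the network; a `git push` later is the only step that needs the server.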
3 replies →
I very much dislike git (mostly because software hadn't been "just source" for many decades before git was cobbled together, and git's binary handling is a huge pain for me), but what does a lockfile achieve that a branch -> merge doesn't, practically, and even conceptually?
8 replies →
You should look into how git is architected to support a lot of features SVN doesn't (distributed repos is a big one). When you clone a git repo you clone the full file history. This is trivial for text but can be extremely large for binary files.
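That said, git has since grown a mitigation for the "clone the full history" cost: partial clone. A sketch, using a local `file://` remote and a zero-filled file as a stand-in for a binary asset:

```shell
# Partial ("blobless") clone: fetch full history but defer file contents.
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
cd src
git config user.email you@example.com
git config user.name you
head -c 65536 /dev/zero > big.bin          # stand-in for a binary asset
git add big.bin && git commit -qm "asset"
git config uploadpack.allowFilter true     # let the "server" honor filters
cd ..

git clone -q --filter=blob:none "file://$tmp/src" lite
# only blobs needed for the checked-out revision are fetched up front;
# older versions of big.bin would be downloaded on demand
ls lite/big.bin
```

It helps with clone size, but it doesn't change the storage model itself, which is the point being made here.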
Storage is not storage: you can store things as full copies or as diffs (and a million other ways). For code, diffs are efficient, but for binaries a diff can approach the size of the original file, so simply sending/storing the full file is better.
These differences have big effects on how git operates, and many of its design choices assume small, diffable text files.
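Git actually does both: loose objects are whole zlib-compressed snapshots, and deltas only appear when objects get packed. There is even a real knob, `core.bigFileThreshold` (default 512 MiB), that tells git to skip delta compression for large files entirely:

```shell
# git stores whole (zlib-compressed) snapshots; deltas only happen at pack
# time. core.bigFileThreshold makes git skip deltification for big files.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config core.bigFileThreshold 100m   # don't deltify files over ~100 MB
git config core.bigFileThreshold        # prints: 100m
```

So git can store big binaries whole; the pain points are more that every clone still carries every historical version, and that packing/GC over huge objects is slow.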
If you do a little research, there's plenty of information on the topic.