Comment by wbillingsley

1 day ago

What I used to recommend to my software engineering classes is that instead of putting large files (media etc.) into Git, you put them into an artifact repository (Artifactory or something like it). That lets you, for instance, publish them as a snapshot dependency that the build system will automatically fetch for you, while you control how much history you keep and only require your colleagues to fetch the latest version. Even better, a simple clean of their build system cache will free up the space used by old versions on their machines.

People like storing everything in git because it significantly simplifies configuration management. A build can be cleanly linked to a single git hash instead of a hash plus a bunch of artifact versions, especially if you vendor your dependencies in source control and stop using an artifact repository completely.

With a good build system using a shared cache, it makes for a very pleasant development environment.
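To make the configuration-management point concrete, here is a rough sketch of the extra bookkeeping you take on once artifacts live outside git: the build's identity has to be derived from the commit plus every artifact version. The `build_id` function and the example names are hypothetical, not part of any real tool.

```python
import hashlib


def build_id(git_commit: str, artifact_versions: dict[str, str]) -> str:
    """Derive one short identifier from a commit and its external artifacts.

    Hypothetical helper: with everything vendored in git, the commit hash
    alone would play this role.
    """
    parts = [f"git={git_commit}"]
    # Sort so the identifier does not depend on dictionary ordering.
    parts += [f"{name}={version}" for name, version in sorted(artifact_versions.items())]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:12]


if __name__ == "__main__":
    # Made-up commit and artifact names, for illustration only.
    print(build_id("3f2a9c1", {"textures": "1.4.0-SNAPSHOT", "audio": "2.0.1"}))
```

Changing any artifact version changes the identifier even when the commit stays the same, which is exactly the thing a single git hash lets you stop tracking.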

It sounds like a submodule... But certainly if the problem could be solved with a submodule, people would have found out long ago. Git's submodules also support shallow-cloning already [1]. I can only guess what the issues with large files are, since I haven't faced them myself; I deal with pure source code most of the time. I'm interested to know why it would be a bad idea to do that, just in case. The caveats pointed out in the second SO answer don't seem to be a big deal.

[1] https://stackoverflow.com/questions/2144406/how-to-make-shal...

  • It sounds different to me - a regular git submodule would keep all history, unlike a file storage with occasional snapshotting.

Do you teach CI/CD systems architecture in your classes? I am finding that this is what the junior engineers we have hired seem to be missing.

That is, tying it all together with GitLab, Artifactory, CodeSonar, Anchore, etc.

This has its own issues. Now you need to provision additional credentials for your CI/CD and for your developers.

Commits become multi-step, as you need to first commit the artifacts to get their artifact IDs to put in the repo. You can automate that via git hooks, but then you're back at where you started: git-lfs.
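As a minimal sketch of that two-step flow, assuming a local directory stands in for the artifact repository and a small JSON file stands in for the pointer you would commit (`publish_artifact`, `write_pointer`, and the `.pointer.json` suffix are all made up for illustration):

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def publish_artifact(source: Path, store: Path) -> str:
    """Step 1: copy the file into a content-addressed store, return its ID."""
    artifact_id = hashlib.sha256(source.read_bytes()).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / artifact_id
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(source, target)
    return artifact_id


def write_pointer(source: Path, artifact_id: str, repo: Path) -> Path:
    """Step 2: write the small pointer file that actually gets committed."""
    repo.mkdir(parents=True, exist_ok=True)
    pointer = repo / (source.name + ".pointer.json")
    payload = {"id": artifact_id, "size": source.stat().st_size}
    pointer.write_text(json.dumps(payload, indent=2) + "\n")
    return pointer


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        video = root / "intro.mp4"
        video.write_bytes(b"pretend this is a large media file")
        artifact_id = publish_artifact(video, root / "store")
        print(write_pointer(video, artifact_id, root / "repo"))
```

Wiring those two steps into a pre-commit hook is roughly the point where you have rebuilt the pointer-file approach that git-lfs already gives you.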