Comment by nikanj
7 years ago
We pushed large binaries into our Git repo in the past. This was fine-ish as long as Git was hosted in-house, but now that it's hosted by a SaaS provider, they are a huge pain in the rear.
I've browsed through a few git guides, but can't seem to find anything that would let me:
1) Do something like "du -s *|sort -n" for the entire Git history
2) Do something like "rm -rf --from-history-too" that would cause the remote repo to actually shrink in size.
I think you should be able to do something like this with
This won't be a terribly fun exercise, and could be very painful if your history contains a lot of merges. (It should be easier with the cactus/rebase development model.)
And of course everyone will have to hard-reset to the new branch.
I should mention I'm far from an expert on this. I've only ever used git filter-branch on a handful of commits, and only based on examples provided by kind internet people. I certainly haven't done anything nearly as far-reaching as what you're about to embark on.
Yeah, filter-branch is the way to go for this. It's also useful for, say, extracting a folder into a new repository. I've used it in a few cases.
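For anyone landing here later, a minimal sketch of what the filter-branch route can look like. The function name `purge_path_from_history` and the `PURGE_PATH` variable are made up for illustration; the git options themselves are standard filter-branch flags. Try it on a fresh clone, not on your only copy.

```shell
#!/bin/sh
# Sketch: drop one path from every commit on every branch and tag.
# purge_path_from_history and PURGE_PATH are illustrative names, not git features.
purge_path_from_history() {
  # The path travels through the environment so it needs no extra
  # quoting inside the filter string that filter-branch evaluates.
  PURGE_PATH="$1" FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch --force \
      --index-filter 'git rm -r --cached --ignore-unmatch -- "$PURGE_PATH"' \
      --prune-empty --tag-name-filter cat -- --all
}

# Extracting a folder into its own history uses a different filter, e.g.:
#   git filter-branch --subdirectory-filter some/folder -- --all
```

Usage would be something like `purge_path_from_history path/to/big.bin`. The old commits stay around under `refs/original/` until you delete those refs, so the repo does not get smaller right away.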
Early in the history of a repository, I committed some files with sensitive information. The only way to fix this (and similar problems) is to reconstruct the repo starting from the commit just before you committed the unwanted file(s).
I'm a bit of a git naif, so there are doubtless better ways to do this. This was mine:
Technically, your repo will be fully reconstructed at step 4. Also, be advised the patch files themselves may have to be massaged to remove references to the file(s) in question. If the filenames themselves are not unwanted, you can add them to .gitignore for good measure.
Step 5 merely preserves the dates of the original commits. Keep in mind that for this last step, your script will have to work in reverse chronological order as the history will be altered from that point forward.
EDIT: Swap steps 1 and 2. Add advisement that patch files may require manual alteration. Add hint regarding .gitignore. Title case "Perl".
Have you looked at git-filter-branch?
The BFG program in this guide [0] seems reasonably close to #2. I don't know if you would need to manually trigger garbage collection in the remote repo, or how you'd do that.
[0] https://help.github.com/articles/removing-sensitive-data-fro...
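On the garbage collection question: how and when a hosted remote prunes is up to the host, so I can't speak to that. Locally, a rewritten clone only shrinks once nothing points at the old commits any more, which means clearing filter-branch's backup refs and the reflog before running gc. A minimal sketch (the function name `shrink_local_repo` is made up):

```shell
#!/bin/sh
# Sketch: make a rewritten local clone actually drop the old objects.
# shrink_local_repo is an illustrative name.
shrink_local_repo() {
  # filter-branch keeps backups of the pre-rewrite refs under refs/original/.
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do git update-ref -d "$ref"; done
  # Reflog entries also keep old commits alive; expire them all.
  git reflog expire --expire=now --all
  # Repack and delete every object that is no longer reachable.
  git gc --prune=now --aggressive --quiet
  # Print the resulting on-disk size in human-readable units.
  git count-objects -vH
}
```

Comparing `git count-objects -vH` before and after gives a rough idea of how much was reclaimed.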
You'll be rewriting history via filter-branch, i.e. everyone would need to do the same for their clone and be sure not to push the old history back up: https://help.github.com/articles/removing-files-from-a-repos...
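One way for a collaborator to line an existing clone up with the rewritten remote is a fetch followed by a hard reset. This is only a sketch: the function name `resync_clone` is made up, and the remote name `origin` and the default branch `master` are assumptions. It throws away local commits and uncommitted changes on that branch, so anything worth keeping should be saved elsewhere first.

```shell
#!/bin/sh
# Sketch: bring an existing clone in line with a rewritten remote branch.
# resync_clone is an illustrative name; "origin" and "master" are assumptions.
resync_clone() {
  branch="${1:-master}"
  # The default fetch refspec force-updates remote-tracking branches,
  # so the rewritten history arrives as origin/<branch>.
  git fetch --quiet origin || return 1
  git checkout --quiet "$branch" || return 1
  # Discard the old local history and point the branch at the new one.
  git reset --quiet --hard "origin/$branch"
}
```

A fresh clone is the more cautious alternative, since it cannot accidentally carry the old history along.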
For part 1, I used this: https://stackoverflow.com/questions/13403069/how-to-find-out...
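A common way to get the "du | sort -n" view over the whole history is to list every reachable object and ask git for its size. The sketch below is one such approach and not necessarily what the linked answer does; the function name `blobs_by_size` is made up. Sizes are the uncompressed blob sizes, not what the objects occupy inside a packfile.

```shell
#!/bin/sh
# Sketch: a "du | sort -n" over every blob reachable from any ref.
# blobs_by_size is an illustrative name.
blobs_by_size() {
  # rev-list prints "<object id> <path>" for each object in history;
  # cat-file turns that into "<type> <size> <path>".
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |   # keep file contents, skip commits and trees
    cut -d' ' -f2- |       # drop the type column, leaving "<size> <path>"
    sort -n                # smallest first, biggest offenders last
}
```

`blobs_by_size | tail -n 20` then shows the twenty largest blobs, including files that were deleted long ago.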
Which for part 2 then leads to the BFG repo cleaner (which I haven't used)
BFG should do the job. It works well for that and for removing files with secrets... or really anything you want to pretend never existed in your repo :)
https://rtyley.github.io/bfg-repo-cleaner/
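A rough sketch of a BFG run, following the steps described on the BFG project page. The function name `bfg_strip_big_blobs`, the `BFG_JAR` variable and the 100M threshold are illustrative choices, and it assumes Java and a downloaded BFG jar are available.

```shell
#!/bin/sh
# Sketch of a BFG run: mirror-clone, strip large blobs, prune.
# bfg_strip_big_blobs, BFG_JAR and the 100M threshold are illustrative choices.
bfg_strip_big_blobs() {
  repo_url="$1"      # URL or path of the repo to clean
  mirror_dir="$2"    # where to put the bare mirror clone
  git clone --mirror "$repo_url" "$mirror_dir" || return 1
  # By default BFG leaves the files in the latest commit alone, so delete
  # them in a normal commit first if they should disappear everywhere.
  java -jar "${BFG_JAR:-bfg.jar}" --strip-blobs-bigger-than 100M "$mirror_dir" || return 1
  # BFG rewrites commits and refs; git still has to prune the old objects.
  ( cd "$mirror_dir" &&
    git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive )
  # Publishing is left as a manual step: cd "$mirror_dir" && git push
}
```

The push is deliberately not part of the function, so the result can be inspected before the remote is touched.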
Have you tried this:
https://rtyley.github.io/bfg-repo-cleaner/
I've used it in the past, and it's pretty straightforward and does the job well :)
https://help.github.com/articles/removing-sensitive-data-fro...