My Git repo has hundreds of gigabytes of data, so I'm trying to remove old, outdated commits, because they're making everything larger and slower. I need a solution that's fast; the faster, the better.
How do I squash all commits except for the most recent ones, and do so without having to manually squash each one in an interactive rebase? Specifically, I don't want to have to use git rebase -i --root.
Current history:
A .. B .. C ... ... H .. I .. J .. K .. L
Desired history:
A .. H .. I .. J .. K .. L
Counting implementation time, the fastest route is almost certainly grafts plus a filter-branch, though you might be able to get faster execution with a hand-rolled commit-tree sequence working off rev-list output.
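A minimal sketch of what the hand-rolled commit-tree route might look like, assuming a strictly linear history (no merges). The function name squash_keep_recent and its three arguments (the old root to keep, the first recent commit to keep, and the tip) are hypothetical, not part of Git. It recreates each kept commit on top of the old root and prints the new tip id without moving any ref.

```shell
# Hypothetical helper: recreate FIRST_KEPT..TIP on top of OLD_ROOT.
# Assumes a linear history; merges are not handled.
# Usage: squash_keep_recent OLD_ROOT FIRST_KEPT TIP
squash_keep_recent() {
    parent=$(git rev-parse --verify "$1^{commit}") || return 1
    # Oldest first: FIRST_KEPT, then each descendant up to TIP.
    for c in $(git rev-list --reverse "$2^..$3"); do
        parent=$(
            # Carry over the original author and committer identity and dates.
            GIT_AUTHOR_NAME=$(git log -1 --format=%an "$c")
            GIT_AUTHOR_EMAIL=$(git log -1 --format=%ae "$c")
            GIT_AUTHOR_DATE=$(git log -1 --format=%ad --date=raw "$c")
            GIT_COMMITTER_NAME=$(git log -1 --format=%cn "$c")
            GIT_COMMITTER_EMAIL=$(git log -1 --format=%ce "$c")
            GIT_COMMITTER_DATE=$(git log -1 --format=%cd --date=raw "$c")
            export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
            export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE
            # Same tree and message as the original commit, but a new parent.
            git log -1 --format=%B "$c" |
                git commit-tree "$c^{tree}" -p "$parent"
        ) || return 1
    done
    # Print the id of the recreated tip; no ref has been touched.
    printf '%s\n' "$parent"
}
```

Something like git branch squashed "$(squash_keep_recent A H L)" would then put the result on a new branch (squashed is a placeholder name), so it can be inspected before any existing ref is moved.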
Rebase is built to apply changes on different content. What you're doing here is preserving contents and intentionally losing the change history that produced them, so pretty much all of rebase's most tedious and slow work is wasted.
The payload here is, working from your picture,
echo `git rev-parse H; git rev-parse A` > .git/info/grafts
git filter-branch -- --all
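Recent Git versions deprecate the .git/info/grafts file in favor of replace refs, so on a newer Git the same payload might be written with git replace --graft instead. A sketch, assuming the two arguments resolve to the H and A commits from the picture; the function name graft_and_rewrite is hypothetical, and FILTER_BRANCH_SQUELCH_WARNING only silences filter-branch's startup warning on Git versions that print one.

```shell
# Hypothetical helper: make PARENT the only parent of COMMIT, then rewrite all refs.
# Usage: graft_and_rewrite COMMIT PARENT
graft_and_rewrite() {
    # Resolve both names to commit ids before any replacement exists.
    commit=$(git rev-parse --verify "$1^{commit}") || return 1
    parent=$(git rev-parse --verify "$2^{commit}") || return 1
    # Record a replacement for COMMIT whose only parent is PARENT.
    git replace --graft "$commit" "$parent" || return 1
    # filter-branch honors replace refs and makes them permanent history.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all || return 1
    # The graft is now baked in, so the replace ref is no longer needed.
    git replace -d "$commit"
}
```

This plays the same role as writing the grafts file and later removing it: the replace ref is only scaffolding for filter-branch, and the rewritten commits stand on their own once it is deleted.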
Filter-branch is very careful to be recoverable after a failure at any point, which is certainly safest, but that only really helps when simply redoing the whole thing wouldn't be faster and easier if things go south on you. Since failures are rare and restarts are usually cheap, the thing to do is run an "unsafe" but very fast operation that is all but certain to work. The best option for that here is to do the work on a tmpfs (the closest equivalent I know of on Windows would be a ramdisk like ImDisk), which will be blazing fast and won't touch your main repo until you're sure you've got the results you want.
So on Windows, say T:\wip is on a ramdisk, and note that the clone here copies nothing. As well as reading the docs on git clone's --shared option, do examine the clone's innards to see the real effect; it's very straightforward.
# switch to a lightweight wip clone on a tmpfs
git clone --shared --no-checkout . /t/wip/filterwork
cd !$

# graft out the unwanted commits ($H and $A stand for the ids of H and A in your picture)
echo `git rev-parse $H; git rev-parse $A` >.git/info/grafts
git filter-branch -- --all

# check that the repo history looks right
git log --graph --decorate --oneline --all

# all done with the splicing, filter-branch has integrated it
rm .git/info/grafts

# push the rewritten histories back
git push origin --all --force
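Since the point of this rewrite is to keep contents while dropping history, one extra check before the force push might be to compare each rewritten ref's tree with its backup under refs/original/, which is where filter-branch saves the pre-rewrite refs. A sketch; the function name rewritten_trees_match is hypothetical, and it passes vacuously if no backups exist.

```shell
# Hypothetical check: every ref that filter-branch rewrote should still
# point at exactly the same tree as its backup under refs/original/.
# Returns 0 when all trees match, 1 otherwise.
rewritten_trees_match() {
    status=0
    # Ref names cannot contain whitespace or glob characters,
    # so word splitting the list is safe here.
    for backup in $(git for-each-ref --format='%(refname)' refs/original/); do
        ref=${backup#refs/original/}
        old_tree=$(git rev-parse --verify "$backup^{tree}")
        new_tree=$(git rev-parse --verify "$ref^{tree}")
        if [ "$old_tree" != "$new_tree" ]; then
            echo "tree changed: $ref" >&2
            status=1
        fi
    done
    return $status
}
```

Run in the work clone right after filter-branch, something like rewritten_trees_match && git push origin --all --force would refuse to push if any branch tip's content had changed.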
There are enough possible variations on what you might be wanting to do and what might be in your repo that almost any of the options on these commands might be useful. The above is tested and will do what it says it does, but that might not be exactly what you want.