Aaron - 1 month ago
Git Question

Viewing individual commits after squashing

I noticed that after I squashed a bunch of commits in Git I am able to still view the individual commits. One of the commits that got squashed with all the others was a revert with the commit message referencing the hash of the commit that it reverted. Doing a git show on this hash shows me the exact content of this commit. This commit is nowhere in my history since it's been squashed into one commit.

How is this possible? Is this still lying around somewhere in the DAG? Will it eventually be garbage collected by Git when things like git gc are run?


Yes: whenever you do almost anything in Git, you are really adding new objects (commits and files and such) to the repository, leaving existing objects in place. The main exception is git gc, but even that leaves existing objects alone until they expire.
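To see this for yourself, here is a minimal sketch that squashes two commits in a throwaway repository and then checks whether the old tip commit is still readable. The temporary directory, file name, and commit messages are made up for the demo, and git reset --soft is used as a stand-in for an interactive squash.

```shell
set -e
# Throwaway repository; names and messages are invented for this demo.
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email "demo@example.com"; git config user.name "Demo"

echo one > file.txt; git add file.txt; git commit -q -m "first"
echo two >> file.txt; git commit -q -am "second"
echo three >> file.txt; git commit -q -am "third"

old=$(git rev-parse HEAD)        # the commit that is about to be squashed away

# Squash the last two commits into one (non-interactive stand-in for rebase -i).
git reset -q --soft HEAD~2
git commit -q -m "second and third, squashed"

# The old commit no longer appears in the branch history...
count=$(git rev-list HEAD | grep -c "$old" || true)
# ...but the object itself is still stored in the repository.
type=$(git cat-file -t "$old")
echo "in history: $count, object type: $type"
```

The squash added a new commit object and left the old ones in place, which is why git show on the old hash keeps working.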

The exact expiration is a bit complicated. Every object normally lives for at least two weeks (the default for gc.pruneExpire), even if nothing refers to it, just so that it will not be removed during slow operations (which might take seconds or even minutes, during which the new objects are not yet recorded anywhere).
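As a rough illustration of that grace period, the sketch below writes a blob that nothing refers to and asks git prune (in dry-run mode, so nothing is deleted) what it would remove under the two-week default versus no grace period at all. The throwaway repository and the blob content are assumptions for the demo.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "init"

# Show the configured grace period, or note that the built-in default applies.
git config --get gc.pruneExpire || echo "gc.pruneExpire unset (default: 2.weeks.ago)"

# Write an object that no commit, tree, or reference points to.
blob=$(echo "orphan" | git hash-object -w --stdin)

# Dry runs: report what would be pruned, delete nothing.
young=$(git prune --dry-run --expire=2.weeks.ago)   # object is too new to expire
all=$(git prune --dry-run --expire=now)             # no grace period at all
echo "with two-week grace period: '$young'"
echo "with no grace period:       '$all'"
```

The freshly written blob should only show up in the second listing, since it is far younger than two weeks.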

Beyond that, object IDs (primarily commit IDs) written into references, such as commits on branches or recorded in HEAD, normally also get written into a per-branch reflog (there's a separate log for HEAD). These entries are time-stamped when they are written, and they live for either 30 days or 90 days by default. The ones that live longer (90 days, gc.reflogExpire) are those that are reachable from the tip of the reference: that is, for HEAD, commits that are still in the history of HEAD, and for branches, commits that are still on the branch. The shorter-lived entries (30 days, gc.reflogExpireUnreachable) are for commits that are no longer on the branch, having been rebased and/or squashed, for instance.
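Here is a small sketch of where those reflog entries can be seen. It replaces a commit in a throwaway repository, prints both reflogs with timestamps, and counts how many HEAD reflog entries still name the replaced commit. The repository, commit messages, and the use of empty commits are assumptions for the demo.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email "demo@example.com"; git config user.name "Demo"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
old=$(git rev-parse HEAD)                 # commit that is about to be replaced
git reset -q --soft HEAD~1
git commit -q --allow-empty -m "second, rewritten"

# HEAD's reflog and the current branch's reflog, each entry with its timestamp.
git --no-pager reflog show --date=iso HEAD
git --no-pager reflog show --date=iso "$(git symbolic-ref --short HEAD)"

# Count the HEAD reflog entries that still name the replaced commit.
hits=$(git --no-pager reflog show --format=%H HEAD | grep -c "$old")
echo "reflog entries naming the old commit: $hits"

# Expiry settings; when unset, the defaults described above apply.
git config --get gc.reflogExpire || echo "gc.reflogExpire unset (default: 90 days)"
git config --get gc.reflogExpireUnreachable || echo "gc.reflogExpireUnreachable unset (default: 30 days)"
```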

These reflog entries serve to protect the objects from the Grim Reaper Collector. This means that your old commit will be around for at least 30 days, not just the 14 days that everything gets.

Deleting a reference, e.g., git branch -D branch, causes its reflog to be removed as well. So if the reflog entry is only in a branch that has been deleted, the grace period may shrink back to 14 days from the object's creation.
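A quick way to watch the reflog disappear along with the branch is git reflog exists, which only reports through its exit status. In this sketch the branch name topic and the throwaway repository are made up.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email "demo@example.com"; git config user.name "Demo"
git commit -q --allow-empty -m "base"

# Creating the branch writes the first entry of its reflog.
git branch topic
if git reflog exists refs/heads/topic; then before=yes; else before=no; fi

# Deleting the branch removes that reflog as well.
git branch -q -D topic
if git reflog exists refs/heads/topic; then after=yes; else after=no; fi

echo "reflog before delete: $before, after delete: $after"
```

Note that if the branch was ever checked out, HEAD's own reflog may still carry entries for the same commits.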

Rebased commits are still referenced by the special name ORIG_HEAD as well, until something (usually another rebase) overwrites ORIG_HEAD. So this may protect commits well past the 30 day default.
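The following sketch shows ORIG_HEAD naming the pre-rebase tip after a plain, conflict-free rebase. The branch name topic, the file names, and the throwaway repository are invented for the demo; it only shows what ORIG_HEAD records, not how long the commit survives.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email "demo@example.com"; git config user.name "Demo"
echo base > a.txt; git add a.txt; git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)     # whatever the default branch is called

git checkout -q -b topic
echo topic > b.txt; git add b.txt; git commit -q -m "topic work"
old=$(git rev-parse HEAD)                 # tip of topic before the rebase

git checkout -q "$main"
echo more > c.txt; git add c.txt; git commit -q -m "more on main"

git checkout -q topic
git rebase -q "$main"                     # copies "topic work" onto the new main tip

orig=$(git rev-parse ORIG_HEAD)
new=$(git rev-parse HEAD)
echo "before rebase: $old"
echo "ORIG_HEAD:     $orig"
echo "after rebase:  $new"
```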

Until git gc actually runs and removes an object, though, it will stick around. It generally will not get copied to clones, but it will potentially still be in your repository, accessible, for years, if git gc never needs to run.
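If you would rather not wait, the usual recipe is to expire the reflog entries and then collect with no grace period. The sketch below does that in a throwaway repository (names and messages are made up). It is destructive, so treat it as an illustration rather than something to run on a repository you care about.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email "demo@example.com"; git config user.name "Demo"
echo one > file.txt; git add file.txt; git commit -q -m "first"
echo two >> file.txt; git commit -q -am "second"
echo three >> file.txt; git commit -q -am "third"
old=$(git rev-parse HEAD)

# Squash the last two commits (stand-in for an interactive rebase).
git reset -q --soft HEAD~2
git commit -q -m "second and third, squashed"

# Drop every remaining name for the old commit:
# all reflog entries, plus ORIG_HEAD, which git reset wrote.
git reflog expire --expire=now --expire-unreachable=now --all
rm -f .git/ORIG_HEAD

# Collect garbage with no grace period.
git gc -q --prune=now

if git cat-file -e "$old" 2>/dev/null; then status=present; else status=gone; fi
echo "old commit is $status"
```

With the reflog entries expired and the two-week grace period overridden, this should report the old commit as gone.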