reniaL reniaL - 1 month ago 11
Git Question

Git commit lost after merge

We have 3 branches (A, B, C) as below:

---\--A1--\------Am------Merge1---An---Merge2---
    \      \              /             /
     \      \--C1---C2---/             /
      \                               /
       \--B1--------------Bn---------/


The problem appears at Merge2. Some commits from branch C (not all of them; let's say C2) are present on branch A between Merge1 and Merge2, but are lost on branch A after Merge2.

When doing Merge2, there was only one file conflict, which is unrelated to the lost commit (C2). We resolved the conflict and finished the merge successfully.

It seems as if C2 was reverted by Merge2 on branch A, without any log entry.

What happened? What might be the cause of this situation?

Answer

What you mean is not that the commit itself is lost, but rather that the changes made (to some file) in that commit have been reverted (un-made).

It's worth noting here that commits don't really "make changes to" files. Instead, each commit stores a complete set of files, whole and intact. When you ask Git to show you some commit (let's say its ID is c2c2c2...):

$ git show c2c2c2

what Git does is:

  1. extract commit c2c2c2...
  2. extract the parent commit of c2c2c2...
  3. produce a diff listing from parent to child

This is how Git manages to show you what changed: it compares "what you had just before that commit" to "what you had as of that commit". (Git can do this quickly because every file is reduced to a hash "fingerprint", so Git can compare the fingerprints (hashes) first. If the hashes are the same, the files are the same. Git only has to bother extracting the actual file data if the hashes differ.)
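If you want to watch the fingerprint comparison happen, here is a minimal sketch. It assumes a throwaway repository; the file names changed.txt and unchanged.txt and the commit messages are made up for illustration and have nothing to do with your real repository.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

printf 'one\n'  > changed.txt
printf 'same\n' > unchanged.txt
git add -A && git commit -q -m first

printf 'two\n' > changed.txt
git add -A && git commit -q -m second

# <commit>:<path> names the blob (file contents) stored in that snapshot.
old_same=$(git rev-parse HEAD~1:unchanged.txt)
new_same=$(git rev-parse HEAD:unchanged.txt)
old_changed=$(git rev-parse HEAD~1:changed.txt)
new_changed=$(git rev-parse HEAD:changed.txt)

echo "unchanged.txt: $old_same -> $new_same"
echo "changed.txt:   $old_changed -> $new_changed"

# The second snapshot still lists both files, not just the modified one.
git ls-tree --name-only HEAD

# Comparing parent to child is what yields the "what changed" listing.
changed_files=$(git diff --name-only HEAD~1 HEAD)
echo "$changed_files"
```

The untouched file keeps the same hash in both snapshots, so Git never needs to look inside it; only changed.txt gets a new hash and shows up in the diff.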

This process of saving whole files, instead of accumulating changes, is called "storing snapshots". (Other version control systems tend to accumulate changes, which is called "storing deltas". They do this because saving the changes to files usually takes far less space than saving the files. Git gets around the issue by compressing its objects and delta-compressing them inside pack files, and winds up using less disk space than older delta-based version control systems anyway.)

Why "Git stores snapshots, not deltas" matters so much here

A merge commit is special, in one particular and obvious way. Look at your own diagram and consider Merge1. What commit comes right before it?

The answer here is that both Am and C2 come "right before" Merge1. That is, commit Merge1 has two parent commits.
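You can ask Git for the parents directly. The sketch below rebuilds only the Merge1 corner of your diagram in a scratch repository; the commit names follow the diagram, while the file names and contents are assumptions made up for the demo.

```shell
set -e
# Scratch repository; file names and contents are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/A        # call the first branch A
git config user.name demo
git config user.email demo@example.com
commit() { git add -A && git commit -q -m "$1"; }

printf 'a\n' > f1; commit Am
git checkout -q -b C
printf 'c\n' > f2; commit C2
git checkout -q A
git merge -q --no-ff -m Merge1 C          # --no-ff forces a real merge commit

# %P prints the hashes of all parents; ^1 and ^2 select one parent each.
parents=$(git log -n 1 --format=%P HEAD)
first=$(git log -n 1 --format=%s HEAD^1)
second=$(git log -n 1 --format=%s HEAD^2)

echo "Merge1 parents: $parents"
echo "first parent:  $first"
echo "second parent: $second"
```

The first parent is the commit you were on when you ran git merge (Am here), and the second is the commit you merged in (C2).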

If you ask Git to show you commit Merge1, which parent should it compare with?

Here's where things get particularly odd. The two commands git log -p and git show seem very similar. In fact, they are very similar. The one obvious difference is that git log -p shows more than one commit, while git show shows just the one commit you tell it to show. You can run git log -n 1 -p <commit> to show just the one commit, and now it seems like these are exactly the same.

They're not!

When you use git show on a merge commit, Git tries to solve the "what commit to compare against" problem by comparing, simultaneously, against both parents. The resulting diff is called a combined diff.

When you use git log -p on a merge commit, though, Git by default just throws up its metaphorical hands, says "I can't show patches against two parents", prints the log message with no diff at all, and goes on to the next commit. In other words, git log -p doesn't even bother trying diffs for the merge.

But wait, there's more

Now, in this case you might be tempted to see if you can figure out what happened to your file from commit c2c2c2... using git show on the two merges—in particular, on Merge2, where the changes got reverted. But git show produces, by default, a combined diff, and a combined diff deliberately omits a lot of diff output. In particular, a combined diff lists only files which were modified from all parents.

Let's say the file where your changes from C2 were reverted is file f2. And, from the graph, the two parents of Merge2 are An (which has f2 the way you want it) and Bn (which doesn't).

What actually happened here is that, during the merge that created Merge2, you somehow told Git to use the version of f2 from commit Bn. That is, file f2 in Merge2 is exactly the same as file f2 in Bn, and different from f2 in commit An.

If you use git show to view Merge2, the combined diff will skip f2, because it is the same as the f2 in Bn.

The same is true, only even worse, with git log -p: it skips the merge entirely, because it's just too hard to show diffs.

Even without -p, when you ask for "files changed", git log winds up doing the same thing—skipping the merge entirely. That's why you can't see it in the log output.

(As an aside, the reason git log master -- f2 never shows commit C2 itself is that adding a file name to the options to git log turns on "history simplification". In what I consider to be somewhat buggy behavior, Git winds up simplifying away too much history, so that it never shows commit C2. Adding --full-history before the -- f2 restores C2 to the set of commits shown. The merge is still missing, though, because git log skips it.)
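To make the last few paragraphs concrete, here is a sketch that reproduces the situation in a scratch repository. It is a simplified version of your graph (no A1, Am, C1 or B1), and it assumes the mistake was taking Bn's copy of f2 while resolving a conflict in another file, here called f1. The file contents are made up.

```shell
set -e
# Scratch repository. f1/f2 follow the answer's naming; contents are made up.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/A        # call the first branch A
git config user.name demo
git config user.email demo@example.com
commit() { git add -A && git commit -q -m "$1"; }

printf 'base\n' > f1; printf 'old\n' > f2; commit base
git branch B
git checkout -q -b C; printf 'new\n' > f2; commit C2
git checkout -q A;    git merge -q --no-ff -m Merge1 C
printf 'from A\n' > f1; commit An
git checkout -q B;    printf 'from B\n' > f1; commit Bn

git checkout -q A
git merge -m Merge2 B >/dev/null 2>&1 || true   # stops on the f1 conflict
printf 'resolved\n' > f1                        # resolve the real conflict
git checkout -q B -- f2                         # assumed mistake: take Bn's f2
commit Merge2

show_out=$(git show --format= HEAD)             # combined diff
logp_out=$(git log -p -n 1 --format=%s HEAD)    # message only, no patch
simplified=$(git log --format=%s -- f2)         # default history simplification
full=$(git log --format=%s --full-history -- f2)

echo "--- git show ---";                   echo "$show_out"
echo "--- git log -p -n 1 ---";            echo "$logp_out"
echo "--- git log -- f2 ---";              echo "$simplified"
echo "--- git log --full-history -- f2 ---"; echo "$full"
```

The combined diff mentions only f1, the log patch output for the merge is empty, and C2 appears in the f2 history only once --full-history is added.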

How to see the change

There is a solution. Both git show and git log take an additional flag, -m, which "splits" merges. That is, instead of treating Merge2 as a merge commit, these will break the merge into two "virtual commits". One will be "Merge2 vs An", and you will see all the differences between those two commits, and the other will be "Merge2 vs Bn", and you will see all the differences between those two commits. This will show that file f2 got re-set to the way it is in Bn, losing the version from C2 that appears in An but not in Bn.

(Include --full-history as well as -m to ensure that commit C2 shows up as well.)

How this happened in the first place

This part is not clear, at all. You said there was a merge conflict, though, which means git merge stopped and got manual assistance, from a human. At some point during this assistance, the human probably updated the index with the version of file f2 from Bn (or at least, a version of f2 that did not have the change made back in C2).

This can happen during merges, and it's a bit insidious precisely because Git shows merges with these compressed (combined) diffs, or in the case of git log -p, not at all, by default. It's something to watch out for, especially if a merge required manual conflict resolution. In most cases, the way to catch this sort of merge error is with automated tests. (Of course, not everything can be tested.)
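Besides tests, a rough sanity check right after a hand-resolved merge can help. The sketch below is a heuristic of my own, not a Git feature: it flags any file that differs between the merge and its first parent even though the merged-in branch never touched that file since the merge base. A deliberate fix-up made during the merge would be flagged too, so treat the output as "look here", not as proof of an error. The scratch repository and its contents are made up, as in the other examples. (Newer Git releases also offer git show --remerge-diff, which is worth looking up in your version's documentation.)

```shell
set -e
# Scratch repository. f1/f2 follow the answer's naming; contents are made up.
repo=$(mktemp -d) && cd "$repo"
tmp=$(mktemp -d)                          # scratch space for the file lists
git init -q .
git symbolic-ref HEAD refs/heads/A        # call the first branch A
git config user.name demo
git config user.email demo@example.com
commit() { git add -A && git commit -q -m "$1"; }

printf 'base\n' > f1; printf 'old\n' > f2; commit base
git branch B
git checkout -q -b C; printf 'new\n' > f2; commit C2
git checkout -q A;    git merge -q --no-ff -m Merge1 C
printf 'from A\n' > f1; commit An
git checkout -q B;    printf 'from B\n' > f1; commit Bn

git checkout -q A
git merge -m Merge2 B >/dev/null 2>&1 || true   # stops on the f1 conflict
printf 'resolved\n' > f1                        # resolve the real conflict
git checkout -q B -- f2                         # assumed mistake: take Bn's f2
commit Merge2

# Files the merge changed relative to the branch we were on (An).
git diff --name-only HEAD^1 HEAD | sort > "$tmp/changed_vs_first_parent"

# Files the merged-in branch (Bn) changed since the two branches diverged.
base=$(git merge-base HEAD^1 HEAD^2)
git diff --name-only "$base" HEAD^2 | sort > "$tmp/changed_on_other_side"

# Anything in the first list but not the second did not come from B's work.
suspicious=$(comm -23 "$tmp/changed_vs_first_parent" "$tmp/changed_on_other_side")
echo "files worth a second look: $suspicious"
```

In this demo f1 is excused because B really did change it, while f2 is flagged: the merge altered it relative to An even though nothing on B ever touched it.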
