UpAndAdam - 2 months ago
Git Question

How to list branches that contain an equivalent commit

In a prior question someone provided an answer for finding branches that contained an EXACT commit:

How to list branches that contain a given commit

The accepted answer highlighted that this only works for an EXACT commit id, and not for an equivalent commit with a different id. It was further stated that git cherry can be used to solve this.

Git cherry SEEMS to be geared for the reverse; finding commits NOT pushed upstream. This is useless if I don't know which branch created it and what is upstream of what. So I don't see how it's going to help solve this problem.

Can someone explain / provide an example of how to use git cherry to find all branches that contain the 'equivalent' of a specific commit?

Answer

Before you can answer the question of which branches contain an equivalent commit you have to determine "which commits are equivalent". Once you have that, you simply use git branch --contains on each of the commits.

Unfortunately, there is no 100% reliable way to determine equivalent commits.

The most reliable method is to check the patch ID of the changeset introduced by the commit. This is what git cherry, git log --cherry, and git log --cherry-mark rely on. Internally, they all rely on the same calculation as git patch-id. A patch ID is essentially the SHA-1 of the normalized diff of changes. Any commit that introduces identical changes will have the same patch ID. Additionally, any commit that introduces mostly identical changes that differ only in whitespace or in the line numbers where they apply in the file will have the same patch ID. If two commits have the same patch ID, it is almost guaranteed that they are equivalent: you will virtually never get a false positive via the patch ID. False negatives occur frequently, though. Any time you do git cherry-pick and have to manually resolve merge conflicts, you have probably introduced differences in the changeset. Even a one-character change will cause a different patch ID to be generated.
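As a rough illustration of patch ID matching between two known branches, the sketch below leans on git log --cherry-mark, which marks patch-equivalent commits with "=". The function name equivalent_commits and its two branch arguments are hypothetical, and this only compares two branches at a time rather than the whole history.

```shell
# Hypothetical helper: print the commits that have a patch-ID twin on the
# other side of the symmetric difference between two branches.
equivalent_commits() {
    # --cherry-mark flags patch-equivalent commits with "=";
    # %m prints that mark and %H prints the full commit hash.
    git log --cherry-mark --no-merges --format='%m %H' "$1...$2" |
        while read -r mark sha; do
            if [ "$mark" = "=" ]; then
                echo "$sha"
            fi
        done
}
```

Something like equivalent_commits feature main would then print both the original commit and its cherry-picked copy, if one exists.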

Checking the patch ID requires scripting, as Chronial advises. First calculate the patch ID of the original commit with something like the following.

(Note: these scripts are untested, but should be reasonably close to working.)

origCommitPatchId=$(git show ORIG_COMMIT | git patch-id | awk '{print $1}')

Now you are going to have to search through all the other commits in your history and calculate the Patch IDs for them, and see if any of them are the same.

for rev in $(git rev-list --all)
do
   # Patch ID of the diff introduced by this commit
   testPatchId=$(git show "${rev}" | git patch-id | awk '{print $1}')
   if [ "${origCommitPatchId}" = "${testPatchId}" ]
   then
      echo "${rev}"
   fi
done

Now you have the list of SHAs, and you can pass each of them to git branch -a --contains.
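Putting the steps together, a minimal sketch might look like the following. It assumes that piping git log -p into git patch-id yields one "patch-id commit-id" pair per commit, which avoids invoking git once per revision. The function name branches_with_equivalent is hypothetical, and merge commits are skipped because they have no single diff to hash.

```shell
# Hypothetical helper: list every branch containing a commit whose
# patch ID matches that of the commit given as the first argument.
branches_with_equivalent() {
    # Patch ID of the original commit (first field of git patch-id output).
    orig_id=$(git show "$1" | git patch-id | awk '{ print $1 }')
    [ -n "$orig_id" ] || return 1

    # One pass over all history: git patch-id prints
    # "<patch-id> <commit-id>" for each commit found in the log output.
    git log -p --all --no-merges | git patch-id |
        while read -r patch_id sha; do
            if [ "$patch_id" = "$orig_id" ]; then
                git branch -a --format='%(refname:short)' --contains "$sha"
            fi
        done | sort -u
}
```

Called as branches_with_equivalent ORIG_COMMIT, it prints the branch names once each, including the branch that holds the original commit.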

What if the above doesn't work for you though, because of merge conflicts?

Well, there are a few other things you can try. Typically when you cherry-pick a commit the original author name, email, and date fields in the commit are preserved. So you will get a new commit, but the authorship information will be identical.

So you could get this info from your original commit with

git log -1 --pretty="%an %ae %ad" ORIG_COMMIT

Then as before you would have to go through every commit in your history, print that same information out and compare. That might give you some additional matches.
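The comparison described above could be sketched roughly as follows. The function name same_authorship is hypothetical, and the match is only as good as the assumption that author name, email, and date together are unique enough in your history.

```shell
# Hypothetical helper: print commits (other than the original) whose
# author name, author email and author date match the given commit.
same_authorship() {
    orig=$(git rev-parse --verify "$1^{commit}") || return 1
    # Authorship fields of the original commit.
    key=$(git log -1 --pretty='%an %ae %ad' "$orig")

    # Print the same fields for every commit and compare them to the key.
    git log --all --pretty='%H %an %ae %ad' |
        while read -r sha fields; do
            if [ "$fields" = "$key" ] && [ "$sha" != "$orig" ]; then
                echo "$sha"
            fi
        done
}
```

Each SHA it prints is only a candidate; it is worth inspecting the diff before treating it as equivalent.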

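You could also use git log --all --grep=ORIG_COMMIT, which would find any commits that reference ORIG_COMMIT in the commit message. This only helps if the hash was actually recorded there, for example by git cherry-pick -x, which appends a "(cherry picked from commit ...)" line.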
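A small sketch of that message search, assuming the full hash appears in the message (as it does when git cherry-pick -x was used). The function name commits_citing is hypothetical.

```shell
# Hypothetical helper: print commits whose message mentions the full
# hash of the commit given as the first argument.
commits_citing() {
    orig=$(git rev-parse --verify "$1^{commit}") || return 1
    # -F treats the pattern as a fixed string rather than a regex.
    git log --all -F --grep="$orig" --format='%H'
}
```

Abbreviated hashes in messages would be missed by this, so it may be worth searching for a short prefix as well.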

If none of those work, you could look for a particular line that the commit introduced by using the pickaxe (git log -S), or use git log --grep to search for something else that might have been unique in the commit message.
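For the pickaxe, a minimal sketch might be the following. The function name commits_touching_string is hypothetical, and the argument is whatever distinctive string you expect the original commit to have added.

```shell
# Hypothetical helper: print commits on any ref that change the number
# of occurrences of the given string (i.e. add or remove it).
commits_touching_string() {
    git log --all -S"$1" --format='%H'
}
```

The original commit and any copies that kept the same line should both show up, along with any unrelated commit that happened to add or remove the same string.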

If this all sounds complicated, well, it is. That's why I tell people to avoid using cherry-pick whenever possible. git branch --contains is incredibly valuable and easy to use and 100% reliable. None of the other solutions even come close.
