Johan Fredrik Varen - 3 months ago
Git Question

How do I debug a large git commit?

Ok, here's the case:

A couple of years ago, multiple changes were made to multiple files in our codebase, and they were all committed at once. Somewhere in those changes hides a bug. Using git bisect, I was quickly able to track down the culprit commit, but the number of changes in that commit made me a little less enthusiastic.

Finding a bad commit is a breeze with git bisect, but once found, what is the best way to track down the single change that made it all go boom? Revert the affected files to their previous version, one by one?


This can be pretty tedious unless you understand very well all the changes that occurred within that large commit.

Typically a very large (bad large) commit involves many different changes. What you need to do is isolate all those changes conceptually and rewrite them as separate commits.

I suggest breaking down changes according to these 4 criteria:

[NEW] all code related to a single identified technical-level feature (as opposed to a user-level feature, which may involve more than one technical-level feature)

[RFG] any behavior-preserving change (refactoring). Executed behavior and APIs (interfaces) stay the same

[CHG] implementation of anything that represents a change of specifications/requirements

[FIX] any change that may alter behavior to make it conform to the intentions behind the code.

Then, git-wise, here is what you need to do:

  git checkout -b CULPRIT <bad commit SHA1>

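This creates a "CULPRIT" branch at the bad commit. I always keep this one as a reference, as you may need to perform many tedious iterations of the following steps. Note that the reset below moves whichever branch is checked out, so do the actual splitting on a second branch created from CULPRIT (for example with git checkout -b SPLIT; the name is arbitrary) so that CULPRIT keeps pointing at the original commit. As a side note, keeping partial references along the way (as branches or tags) helps.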

  git reset --mixed HEAD^

This undoes the commit but keeps its changes in the working tree, as if the whole commit had been applied as a patch of unstaged changes on top of the previous commit. Then using

  git add --patch

you can stage a subset of those changes. Do not hesitate to use the [s]plit option to break a hunk into smaller ones. Sometimes you cannot avoid the [e]dit option to hand-pick the exact lines you want. Commit each staged subset, then repeat, so that the original change is rebuilt as multiple commits broken down along the NEW, RFG, CHG, FIX scheme I suggest above. Create as many commits as you need.
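The steps so far can be sketched as one throwaway script. This is only an illustration under assumptions: the repository, the file names, the commit messages and the SPLIT work branch are all made up, and whole files are staged with plain git add so the script runs unattended, where the steps above use git add --patch interactively.

```shell
#!/bin/sh
set -eu

# Throwaway repository so that no real history is touched.
cd "$(mktemp -d)"
git -c init.defaultBranch=main init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# A parent commit, then one "large" commit that changes two files at once.
printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "base"
printf 'a2\n' > a.txt
printf 'b2\n' > b.txt
git commit -q -am "big commit"
bad=$(git rev-parse HEAD)

# Reference branch at the bad commit, plus a separate branch to work on
# (SPLIT is a hypothetical name).
git checkout -q -b CULPRIT "$bad"
git checkout -q -b SPLIT

# Undo the commit on SPLIT only: the index is reset, the files are kept.
git reset -q --mixed HEAD^

# Stage and commit one piece at a time. Whole files are staged here so the
# script runs unattended; interactively this is where git add --patch goes.
git add a.txt
git commit -q -m "[FIX] correct a.txt"
git add b.txt
git commit -q -m "[NEW] add b.txt feature"

# Show the rewritten history, oldest first, and count the new commits.
git log --reverse --format=%s
new_commits=$(git rev-list --count "$bad^..HEAD")
echo "commits replacing the big one: $new_commits"
```

CULPRIT is never checked out again after the reset, so it still names the original commit and can be compared against later.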

Be mindful of:

  • creating a new commit that does not compile
  • creating a new commit that produces a "trivial" runtime error (a segfault, for example)
  • splitting apart sub-commits that need to be combined to make the thing work

...because the goal is to make bisect work on the new commits. Furthermore, make sure your new commits add up to the old commit: run git diff between the new HEAD commit and CULPRIT to confirm you did not introduce any further changes.
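Both checks can be scripted. The sketch below is a minimal illustration, not a definitive recipe: the repository, file names and SPLIT branch are invented, and the check function is a hypothetical stand-in for whatever "compiles and does not crash trivially" means in your project (a real build or smoke test would go there).

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git -c init.defaultBranch=main init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Parent commit, one large commit, then the same change split in two.
printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
printf 'a2\n' > a.txt
printf 'b2\n' > b.txt
git commit -q -am "big commit"
git checkout -q -b CULPRIT
git checkout -q -b SPLIT
git reset -q --mixed HEAD^
git add a.txt
git commit -q -m "[RFG] a.txt"
git add b.txt
git commit -q -m "[CHG] b.txt"

# Check 1: the split history must end in exactly the tree of the old commit.
# git diff --quiet exits 0 when there is no difference.
if git diff --quiet SPLIT CULPRIT; then identical=yes; else identical=no; fi
echo "SPLIT matches CULPRIT: $identical"

# Check 2: every new commit must pass on its own.
# Hypothetical stand-in for a real build or smoke test.
check() { test -s a.txt && test -s b.txt; }

failed=0
for commit in $(git rev-list --reverse "$base..SPLIT"); do
    git checkout -q "$commit"
    if check; then
        echo "ok   $(git log -1 --format=%s)"
    else
        echo "FAIL $(git log -1 --format=%s)"
        failed=$((failed + 1))
    fi
done
git checkout -q SPLIT
echo "failing commits: $failed"
```

If you would rather not write the loop yourself, git rebase --exec with your build command on the same commit range does a comparable per-commit check.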

This is a pain at first and requires a lot of practice but once you become good enough at doing this, you will become a debugging god to be worshipped by your whole team and then can spread the gospel of small commits in the form of NEW,CHG,RFG and FIX.
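Once the large commit has been split, bisect can be pointed at the new, smaller commits. Here is a minimal sketch of that last step, assuming a made-up repository in which a grep on b.txt plays the role of your real test; the file names and commit messages are illustrative only.

```shell
#!/bin/sh
set -eu

cd "$(mktemp -d)"
git -c init.defaultBranch=main init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Known-good starting point.
printf 'a1\n' > a.txt
printf 'good\n' > b.txt
printf 'c1\n' > c.txt
git add a.txt b.txt c.txt
git commit -q -m "base"
good=$(git rev-parse HEAD)

# Three small commits standing in for the split-up culprit.
# The bug arrives with the second one.
printf 'a2\n' > a.txt
git commit -q -am "[RFG] tidy a.txt"
printf 'broken\n' > b.txt
git commit -q -am "[CHG] change b.txt"
printf 'c2\n' > c.txt
git commit -q -am "[NEW] extend c.txt"
bad=$(git rev-parse HEAD)

# Bisect between the known-bad tip and the known-good base.
# The command given to "bisect run" must exit 0 for good, 1 for bad.
git bisect start "$bad" "$good" >/dev/null
git bisect run grep -q good b.txt >/dev/null

# When the run finishes, refs/bisect/bad names the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$first_bad")"
# prints "first bad commit: [CHG] change b.txt"
```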