devoured elysium - 1 month ago
Git Question

Squashing a sequence of small merges from master into my branch with git while keeping reference to master?

I had a very complicated merge to do. Part of the problem was that I had let too much time pass, so the amount of changes to incorporate into my branch was monstrous.

To make things easier, I opted to run git merge origin/master~20, then git merge origin/master~17, then git merge origin/master~15, etc., so I could resolve the conflicts piecemeal instead of having to take them all at once.

The problem is that this led to a pollution of the log history that I would like to get rid of. What is the best approach to squash all these merge commits into one while still keeping the resulting commit pointing to both my branch and master?

I usually squash by using git reset --soft, but that would not leave a reference to the master branch. I also tried git rebase -i --preserve-merges, but I got "Refusing to squash a merge" error messages.

How should I proceed?

Answer

Let me describe your situation this way: You have the merge result you want—the source tree—but not the history you want that leads to this result.

As VonC has put it elsewhere, and you attempted yourself, git reset --soft usually is the answer. You do the soft reset, then make a new commit. If only you could make a merge commit at this point, it would still be the answer.

There are three easy ways to do this without git rerere. The first is clunky but uses only ordinary tools, the second is outright cheating, and the third is "documented cheating" and is therefore probably the right way.

Method 1: clunky, but uses all normal tools (no cheating)

Note that the command sequence here assumes you are in the top level of your repository (in particular the git rm and git checkout steps below refer to . to mean "everything"). I also use $startpoint for the commit after which you want to have your merge, and $other to refer to the other branch-name or commit ID (the one you want to git merge).

  1. Save the ID of the final result (we want the tree, but the standard tools make it easiest to just refer to the commit, which also works fine):

    $ git tag temp-save-result
    

    (or use cut-and-paste or the reflog to save it; I show a tag just for simplicity here).

  2. Reset. This might as well be --hard, rather than --soft:

    $ git reset --hard $startpoint
    
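  3. Run the merge, which will fail with conflicts. Ignore the conflicts and remove everything from the index and work-tree. We don't want the conflicted merge, or any of the temporary results so far, because we have the proper results elsewhere. (The -f is there because git rm otherwise refuses to remove files that have changes staged in the index, which is the case for every file the merge handled cleanly.)

    $ git merge $other
    $ git rm -rf .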
    

    (if you have some custom merge tools that leave droppings behind, you might want to clean those out of the work tree here as well, although they won't affect anything important: they'll just be cluttering up your work tree).

  4. Extract the work-tree saved in step 1, and commit the result:

    $ git checkout temp-save-result -- .
    $ git commit
    

This commit concludes the merge, whose tree comes from the tree you saved in step 1. You can now delete the tag:

$ git tag -d temp-save-result
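If you want to rehearse Method 1 before touching real work, something like the following sketch may help. It builds a throwaway repository in a temporary directory, recreates the piecemeal-merge situation on a small scale, and then runs the four steps. Everything in it is an assumption made up for illustration: the branch names trunk and topic, the file f.txt, the demo identity, and the commit messages. It uses git rm with -f so that the removal also works when some files have staged changes.

```shell
#!/bin/sh
# Rehearsal of Method 1 in a scratch repository (all names are hypothetical).
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # fixed branch name, whatever the default is
git config user.name demo
git config user.email demo@example.com

echo base > f.txt
git add f.txt
git commit -q -m base

# topic: one commit of our own; this plays the role of $startpoint
git checkout -q -b topic
echo topic > f.txt
git commit -q -am 'topic work'
startpoint=$(git rev-parse HEAD)

# trunk moves on with two commits that conflict with topic; its tip is $other
git checkout -q trunk
echo trunk1 > f.txt
git commit -q -am 'trunk 1'
echo trunk2 > f.txt
git commit -q -am 'trunk 2'
other=$(git rev-parse trunk)

# the piecemeal merges on topic, each one conflicting and resolved by hand
git checkout -q topic
for step in trunk~1 trunk; do
    git merge "$step" >/dev/null 2>&1 || true   # the conflict is expected
    echo "resolved after $step" > f.txt
    git add f.txt
    git commit -q -m "piecemeal merge of $step"
done

# Method 1 proper
git tag temp-save-result                        # step 1: remember the good result
saved_tree=$(git rev-parse 'temp-save-result^{tree}')
git reset -q --hard "$startpoint"               # step 2
git merge "$other" >/dev/null 2>&1 || true      # step 3: conflicts again, ignored
git rm -q -r -f .                               #         empty index and work-tree
git checkout temp-save-result -- .              # step 4: bring back the saved tree
git commit -q -m 'Merge trunk into topic'

# inspect the outcome, then drop the tag
first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
new_tree=$(git rev-parse 'HEAD^{tree}')
git tag -d temp-save-result >/dev/null
git log --graph --oneline
```

The three variables at the end are what I would check by eye: the first parent should be $startpoint, the second should be $other, and the tree should be the one saved in step 1.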

Method 2: cheat

When git commit makes a new commit, it makes a merge commit if .git/MERGE_HEAD exists. The MERGE_HEAD file contains the ID of the second commit, i.e., the other or remote or --theirs that is being merged-in.

So, we simply do a soft reset as usual, then add the merge ID, then commit. (NB: I have not tried this lately, and Git might also want .git/MERGE_MSG. Be prepared to need to tweak the cheat, or just move on to method 3.)

$ git reset --soft $startpoint
$ git rev-parse $other > .git/MERGE_HEAD
$ git commit

The first command is our usual git reset --soft step, the second lies to Git to say that we're resolving a conflicted merge (and the index is all resolved, so we must be done with that step), and the git commit now commits the merge.
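Since I have not run the cheat lately, here is a sketch you could use to try it in a scratch repository first. The repository layout (branches trunk and topic, the file names, the demo identity) is invented for the test. One deliberate difference from the commands above: the path of MERGE_HEAD is obtained from git rev-parse --git-path rather than hard-coded, which should also cope with setups where .git is not a plain directory. The commit message is passed with -m, so no MERGE_MSG file is involved here.

```shell
#!/bin/sh
# Trying Method 2 in a scratch repository (all names are hypothetical).
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name demo
git config user.email demo@example.com
echo base > base.txt
git add base.txt
git commit -q -m base

# topic diverges from trunk; its first commit is $startpoint
git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m 'topic work'
startpoint=$(git rev-parse HEAD)

# trunk gains two commits; its tip is $other
git checkout -q trunk
for n in 1 2; do
    echo "trunk $n" > "trunk$n.txt"
    git add "trunk$n.txt"
    git commit -q -m "trunk $n"
done
other=$(git rev-parse trunk)

# two small merges on topic (no conflicts needed for this test)
git checkout -q topic
git merge -q -m 'piecemeal 1' trunk~1
git merge -q -m 'piecemeal 2' trunk
wanted_tree=$(git rev-parse 'HEAD^{tree}')

# Method 2: soft reset, plant MERGE_HEAD, commit
git reset --soft "$startpoint"
git rev-parse "$other" > "$(git rev-parse --git-path MERGE_HEAD)"
git commit -q -m 'Merge trunk into topic'

first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
new_tree=$(git rev-parse 'HEAD^{tree}')
git log --graph --oneline
```

If the resulting commit shows only one parent, or git commit complains, the cheat needs tweaking for your Git version and method 3 is the safer route.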

Method 3: use a plumbing command ("documented cheating")

The command that makes an actual commit object—git commit was no doubt once just a shell script that ran it near the end—is git commit-tree. It requires:

  • a tree object that contains the desired tree
  • a commit message (it will read one from stdin, but probably you should use -m or -F)
  • the parent IDs for the parents of the new commit

and it writes the new commit object to the repository, and prints out the object's hash ID.

We already have the tree! Since it's the current commit, its ID is HEAD^{tree} (using gitrevisions syntax). We have the two parent IDs as well. All we need is the message:

$ tmp=$(git commit-tree -p $startpoint -p $other -m 'merge msg' HEAD^{tree})

Once we have the commit ID we just need to git reset --hard our current branch to point to it:

$ git reset --hard $tmp

(Of course, you can combine the two into one big command using $(...) instead of $tmp in the second command, although that assumes your git commit-tree command will work: it's probably better to do two steps for personal comfort. You can omit the $tmp variable, and the corresponding shell $(...) syntax, by cutting and pasting the hash ID, if you like that better. And, you can put in a cheesy temporary commit message, then edit it with git commit --amend once you have reset to it: the --amend option works on merge commits too.)
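The two steps can be wrapped in a small shell function. The sketch below is only one way to do it: the function name squash_into_merge is made up, and so is the scratch repository used to exercise it (branches trunk and topic, the file names, the demo identity). The function refuses to run when there are uncommitted changes to tracked files, because the final git reset --hard would throw them away.

```shell
#!/bin/sh
# Method 3 as a helper function, exercised in a scratch repository.
# squash_into_merge and every repository name below are hypothetical.
set -e

# usage: squash_into_merge <startpoint> <other> <message>
# Replaces the current branch tip with a single merge commit whose parents
# are <startpoint> and <other> and whose tree is the tree of the current HEAD.
squash_into_merge() {
    git diff --quiet HEAD || { echo 'uncommitted changes, aborting' >&2; return 1; }
    tree=$(git rev-parse 'HEAD^{tree}')
    new=$(git commit-tree -p "$1" -p "$2" -m "$3" "$tree")
    git reset -q --hard "$new"
}

repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name demo
git config user.email demo@example.com
echo base > base.txt
git add base.txt
git commit -q -m base

# topic diverges from trunk; its first commit is $startpoint
git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m 'topic work'
startpoint=$(git rev-parse HEAD)

# trunk gains two commits; its tip is $other
git checkout -q trunk
for n in 1 2; do
    echo "trunk $n" > "trunk$n.txt"
    git add "trunk$n.txt"
    git commit -q -m "trunk $n"
done
other=$(git rev-parse trunk)

# two small merges on topic, then squash them into one
git checkout -q topic
git merge -q -m 'piecemeal 1' trunk~1
git merge -q -m 'piecemeal 2' trunk
wanted_tree=$(git rev-parse 'HEAD^{tree}')

squash_into_merge "$startpoint" "$other" 'Merge trunk into topic'

first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
new_tree=$(git rev-parse 'HEAD^{tree}')
git log --graph --oneline
```

The old piecemeal merge commits are not destroyed by this; they stay reachable through the reflog for a while, so a mistake can be undone with another git reset --hard.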