Say that I have the following scenario:
A -- B -- C -- D -- H master
E -- F -- G topicA
I -- J -- K topicB
rebase --onto master topicA topicB
As DavidN said in a comment, your plan looks sound enough.
(The rest of this is far too long and rambly; it was written between / during other tasks.)
Since H was created by git merge --squash, the drawing is wrong. It should read:
    A -- B -- C -- D -- H   master
          \
           E -- F -- G   topicA
                 \
                  I -- J -- K   topicB
The key difference is that commit H is not related to the E--F--G sequence, at least not in any Git-detected sense. Commit H's content is affected by whatever happened in the E--F--G sequence (and, of course, by whatever happened in the C--D sequence) but as far as Git knows now, someone came along and wrote H without even looking at E--F--G.
I'm going to have a fairly big digression here now.
If I now merge master into topicB
OK, let's draw that as a commit graph, to make sure that this means what you intend. I will use my usual form (slightly more compact, with arrows from branch-names scooted over to the right a bit):
    A--B--C---D----H         <-- master (still points to H)
        \           \
         E--F--G     \       <-- topicA (still points to G)
             \        \
              I--J--K--L     <-- topicB (points to new L)
Note that I drew a real merge, not a fake, not-a-merge-at-all "squash merge". This really does matter, as we will see.
When Git goes to make this new commit L, it has to merge commits H and K. To do that, it has to find their merge base (in some cases there can be several merge bases, but here there's just one).
The merge base(s) of any two commits is / are the Lowest Common Ancestors: that is, the commits closest to the two starting commits (H and K) that are reachable from both of those starting commits.
Let's start with H and K themselves first. H is reachable from H (of course) but not from K. K is reachable from K (of course) but not from H. Now we can check D: D is reachable from H but not from K. Now we can check J, but it's not reachable from H. Now we consider C, I, F, and E, but it's not until we get all the way back to commit B that we find a commit reachable from both H and K. Commit A would also work, but it's further away from both H and K, so commit B is the merge base.
The merge then starts with two diffs:
git diff B H
git diff B K
The first diff shows what we changed going from B to H. Of course, H has what we changed in C and D, plus whatever we changed in E, F, and G. The second diff shows what we changed going from B to K. Of course, K has what we changed in F and E, plus whatever we changed in I, J, and K.
This has whatever we changed in F and E twice, but Git usually (not always, but usually) does a good job of noticing that and picking up the change only once. So commit L probably has everything from every previous commit, done just once.
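The merge-base walk and the two diffs above can be reproduced in a scratch repository. This is only a sketch under assumptions: the helper c, the one-file-per-commit layout, and the tags named after each commit letter are all made up for illustration and are not part of the original question.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master        # first branch is master regardless of defaults
git config user.name "Example" && git config user.email "example@example.invalid"
# Hypothetical helper: one commit per letter, each adding its own file, tagged with that letter.
c() { for n in "$@"; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -q -m "$n"; git tag "$n"; done; }
c A B
git checkout -q -b topicA; c E F
git checkout -q -b topicB; c I J K
git checkout -q topicA;    c G
git checkout -q master;    c C D
git merge -q --squash topicA >/dev/null 2>&1   # stages E, F, G's changes, records no merge
git commit -q -m H && git tag H

base=$(git merge-base master topicB)
git log -1 --format=%s "$base"                 # prints B
git diff --name-only B H                       # C.txt D.txt E.txt F.txt G.txt
git diff --name-only B K                       # E.txt F.txt I.txt J.txt K.txt
```

E.txt and F.txt show up in both lists: that is the "changed twice" overlap that Git has to recognize and apply only once.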
and then do a git diff master...topicB
Note that this is using the three-dot ... syntax, not the two-dot .. syntax. I'm not sure what you intend here, but the three-dot syntax essentially means "find the (or a) merge base". So let's go through this exercise again:
master still points to commit H. topicB now points to the new merge commit L, and we find the merge base of H and L, now that we have a real merge (none of this stupid "squash merge" fake merge stuff for us, no way!).
So let's start with H and L themselves first. L is reachable from L (of course) but not from H. H is reachable from H (of course) and also from L. This means the merge base of H and L is H: the merge base of master and topicB is master.
Since master is on the left of the triple-dot, it's replaced with the merge base, which is commit H. The right side of the triple-dot is resolved to its commit, which is commit L. The diff then shows you whatever is different between H and L.
In this case, the effect is the same as for git diff master..topicB, which means the same thing as git diff master topicB: compare commits H and L, in that order.
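To check the real-merge case, here is a sketch in a scratch repository. The helper c, the per-commit files, and the letter tags are assumptions made for illustration only.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
# Hypothetical helper: one commit per letter, each adding its own file, tagged with that letter.
c() { for n in "$@"; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -q -m "$n"; git tag "$n"; done; }
c A B
git checkout -q -b topicA; c E F
git checkout -q -b topicB; c I J K
git checkout -q topicA;    c G
git checkout -q master;    c C D
git merge -q --squash topicA >/dev/null 2>&1
git commit -q -m H && git tag H

git checkout -q topicB
git merge -q -m L master                       # a real merge: L has two parents, K and H
git tag L

git log -1 --format=%s "$(git merge-base master topicB)"   # prints H
three=$(git diff master...topicB)              # merge base (H) versus L
two=$(git diff master topicB)                  # H versus L
[ "$three" = "$two" ] && echo "three-dot and two-commit diffs match"
```

Because H is an ancestor of L after a real merge, the three-dot form has nothing extra to dig up: it compares H against L, just as the plain two-commit form does.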
That should be a pretty sensible diff, in spite of the horrible fake squash-merge we did initially to make
H. The real merge sort of repaired this, at least for
Let's draw this thing yet again but this time using the fake not-a-merge git merge --squash technique. The contents of our new commit L will be the same as if we had done a real merge, but the graph will be different:
    A--B--C---D----H         <-- master (still points to H)
        \
         E--F--G             <-- topicA (still points to G)
             \
              I--J--K--L     <-- topicB (points to new L)
Now we go back to:
git diff master...topicB
Once again, we need to find the merge base between H and L, but now L does not point back to both K and H, but only to K. Neither H nor L is the merge base, because H is not reachable from L and L is not reachable from H. Neither K nor J work either: we can walk backwards to them from L, but not from H. In fact, the merge base commit is again commit B, so this means the same thing as:
git diff B L
and this diff will be quite different.
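The squash variant can be checked the same way. Again, this is a sketch in a scratch repository; the helper c, the per-commit files, and the letter tags are illustrative assumptions.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
# Hypothetical helper: one commit per letter, each adding its own file, tagged with that letter.
c() { for n in "$@"; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -q -m "$n"; git tag "$n"; done; }
c A B
git checkout -q -b topicA; c E F
git checkout -q -b topicB; c I J K
git checkout -q topicA;    c G
git checkout -q master;    c C D
git merge -q --squash topicA >/dev/null 2>&1
git commit -q -m H && git tag H

git checkout -q topicB
git merge -q --squash master >/dev/null 2>&1   # same content as a real merge ...
git commit -q -m L && git tag L                # ... but L has K as its only parent

git log -1 --format=%s "$(git merge-base master topicB)"   # prints B, not H
git diff --name-only master...topicB           # everything since B, including C, D, E, F, G
```

With only one parent on L, the merge base falls all the way back to B, so the three-dot diff lists changes that master already contains.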
I do not know what you were expecting from your diff, so I cannot address this part:
the diff is messed up and contains a lot of changes or undoings which shouldn't be there.
Now let's return to the question:
In that case, I usually merge
topicB before doing what was said in the previous paragraph. However, sometimes that is not possible (e.g. the branch was deleted) and I end up with a lot of conflicts.
Note that deleting a branch name has no immediate effect upon its commits. What it does do is to stop protecting those commits. That is, because each branch name makes commits reachable, those commits are safe from the Grim Collector ...er... Grim Reaper Garbage Collector. We did that reachability thing several times to find merge bases; Git does it even more often, though, to find commits to keep and commits to discard, during GC; commits to transfer, during fetch; and so on. If the commits are protected by some other means (by reachability through a real merge, or by reachability from another branch or tag name, or whatever) they stick around. If you can find them by hash ID, you can bring them back.
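A small sketch of that last point, in a throwaway repository. The branch names doomed and restored are made up for the example, and it assumes no garbage collection runs between the delete and the restore.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base"
git checkout -q -b doomed
git commit -q --allow-empty -m "work on doomed"
tip=$(git rev-parse HEAD)                      # remember the hash ID before deleting the name

git checkout -q master
git branch -q -D doomed                        # removes the name only, not the commit
git cat-file -t "$tip"                         # prints: commit
git branch restored "$tip"                     # a new name makes it reachable again
git log -1 --format=%s restored                # prints: work on doomed
```

The commit stayed in the object database the whole time; only the name that made it easy to find went away.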
More importantly for your rebase case, if you can find them by git log, you can cut them off. We'll see this in a moment.
Because squash "merges" are not actually merges at all, they won't protect the other chain of commits, and (this is usually the key to the future merge conflicts) they do not provide future merges with updated merge bases. This means those future merges must examine huge diffs, instead of small diffs, and then Git's automated "redundant change" detection fails.
What this means in practice depends on how you use these squash not-a-merge "merges". When you use them to take a line of development and reduce it to a single commit, it's probably a good idea to stop using that other line of development entirely. You can save it (using a branch or tag name, or even some other reference outside the branch and tag name spaces so that you don't normally see it, that keeps the commit chain from being GC-ed) or just let it get reaped, but either way you probably should not continue working on it, and that includes any other branches you have that fork off from some commit(s) on it.
Is rebase --onto master topicA topicB the right solution?
With git rebase, you can copy these other chains (your topicB, in this case) to new chains and then point the label (topicB) to the tip of the copied chain. The commits you want to copy are those that were not squashed: here, that's the I--J--K chain. Using topicA as the <upstream> argument to git rebase will select the right set of commits. Note that topicA reaches commits G, F, E, and so on; so using topicA as <upstream> chops off everything from F on back, but then requires the explicit --onto that you provided (without --onto, the copies would land on top of topicA itself rather than on master).
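Here is a sketch of that rebase in a scratch repository. The helper c, the per-commit files, and the letter tags are assumptions made for illustration; the rebase command itself is the one from the question.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
# Hypothetical helper: one commit per letter, each adding its own file, tagged with that letter.
c() { for n in "$@"; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -q -m "$n"; git tag "$n"; done; }
c A B
git checkout -q -b topicA; c E F
git checkout -q -b topicB; c I J K
git checkout -q topicA;    c G
git checkout -q master;    c C D
git merge -q --squash topicA >/dev/null 2>&1
git commit -q -m H && git tag H

# Copy only the commits reachable from topicB but not from topicA (I, J, K) onto master.
git rebase -q --onto master topicA topicB >/dev/null 2>&1
git log --format=%s master..topicB             # K, J, I (the copies), newest first
git log -1 --format=%s "$(git merge-base master topicB)"   # prints H
```

After the rebase, topicB sits directly on top of H, so later merges and three-dot diffs against master use H as their merge base.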
If the label topicA were deleted, you could still do this rebase; it just gets trickier. What you would need to do is to specify either of commits G or F by their hash IDs, so as to chop off commits F and earlier. The hash ID of G is anywhere from hard-to-find (GC has not deleted it but it is unreachable from any live reference) to non-existent (GC has deleted it). The ID for F, however, is right there in the git log output for topicB: K's parent is J, J's parent is I, and I's parent is F. The problem is that there is no easy way to determine that commit F was in the set of commits that were in the chain that the earlier git merge --squash handled.
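A sketch of the deleted-label case follows. It assumes you have already worked out, by reading the log, that F is the last commit covered by the squash; Git does not tell you this. The helper c and the per-commit files are made up for the example, and topicB~3 is simply one way to name F here, since K, J, and I are the three commits to keep.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
# Hypothetical helper: one commit per letter, each adding its own file (no tags this time).
c() { for n in "$@"; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -q -m "$n"; done; }
c A B
git checkout -q -b topicA; c E F
git checkout -q -b topicB; c I J K
git checkout -q topicA;    c G
git checkout -q master;    c C D
git merge -q --squash topicA >/dev/null 2>&1
git commit -q -m H
git branch -q -D topicA                        # the label is gone; G is now unreachable

git log --format='%h %s' topicB                # K, J, I, F, E, B, A: F's hash ID is visible here
cut=$(git rev-parse topicB~3)                  # assumption: F, three parents back from K
git rebase -q --onto master "$cut" topicB >/dev/null 2>&1
git log --format=%s master..topicB             # K, J, I (the copies), newest first
```

Passing the hash ID of F as the upstream argument selects the same I--J--K chain that topicA would have selected.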
(This is related to, but not quite the same thing as, the earlier remark I bolded.)