PrivateJoker - 1 year ago
Git Question

Trying to understand Git (Through VS 2017)

I have been reading about and playing around with Git; however, I still can't understand how to work with branches correctly.

For example I have a LocalMaster which syncs back to the latest version on the network.

After I Pull everything I run:

git checkout -b BugFix

This creates a BugFix branch and switches me to that branch. So far, so good.

Now I start to make changes in this branch.

  1. If I don't commit why do I see these changes if I switch back to LocalMaster?

  2. Say I'm making changes in BugFix branch and I commit them to that branch only. How can I view what changes exist in this branch?

  3. Assume that I made changes in step 2. Now something else came up and I just want to make a quick fix in the LocalMaster branch. I can do that and push it back to the network. How can I synch my BugFix branch with that same change?

  4. Say I made a change in Step 2 that I don't want to merge to the LocalMaster. How can I back out that change?

Answer Source

Many GUIs hide many of the details from you, often on purpose to try to make Git look simpler. I think this is a mistake as the underlying Git details keep popping out anyway. I don't know to what extent VS2017 tries to hide things—in general, I just avoid GUIs except for special cases.

Anyway, what's going on here is that you are dealing with the fact that in Git, there are three (!) copies of every file. One is the read-only, permanently-committed version, which is part of the current commit. A second is the one in what Git calls, variously, the index, the staging area, or the cache, depending on who wrote the documentation and when.

These two versions of each file are kept in an internal, Git-ty, compressed format. (It's so well compressed that there's really only one copy most of the time.) If your GUI is good, it will let you view both of these despite them being in a format that only Git itself can deal with directly.

The third copy of a file is the one in your work-tree. This is an ordinary file, in an ordinary file system, stored in such a way that ordinary programs can deal with it in the ordinary way. This is the only version of the file that really takes very much room (except for some binary files that Git does not compress very well). So the three copies are mostly just a management headache.

Obviously, the work-tree version of the file is something you can change. It's just an ordinary file, after all. You can edit it, rewrite it, or even remove it. You can also create new files, which aren't yet in the index and maybe are not in any commit either.

The index version of each file is also write-able. You write it using git add: this copies whatever is in the work-tree right now, into a file of the same name in the index. If there's already one in the index, this replaces the index copy. If there is not one in the index yet, this puts one into the index.

The most important way that you deal with the index, other than by using git add to copy things into it, is what happens when you run git commit. At this point, Git looks at your index, not your work-tree, to make the new commit. Whatever files are in the index, those files go into the new commit. If you remove a file from the index (using git rm), that file is not in the new commit. The contents of each file are whatever you copied into the index: no more and no less.

Once the new commit is safely saved away in the repository, Git changes the current branch (as recorded in HEAD) so that the branch names the new commit. The new commit's parent is whatever commit was the current commit before. Since Git just made the new commit using the index, the commit and the index now match.
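To see the three copies side by side, here is a minimal sketch in a throwaway repository. The temporary directory, the file name a.txt, and the user identity are placeholders I made up for illustration; nothing here touches a real project.

```shell
set -e
repo=$(mktemp -d)          # scratch repository; the location is arbitrary
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo "one" > a.txt
git add a.txt              # index copy of a.txt now holds "one"
echo "two" > a.txt         # work-tree copy changes; the index copy does not
git commit -q -m "first"   # the commit is built from the index, not the work-tree

committed=$(git show HEAD:a.txt)   # the committed copy
staged=$(git show :a.txt)          # the index copy
worktree=$(cat a.txt)              # the work-tree copy
echo "committed=$committed staged=$staged worktree=$worktree"
# prints: committed=one staged=one worktree=two
```

The commit and the index agree, while the later work-tree edit stays outside both until the next git add.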

This sets the scene for your four questions:

If I don't commit why do I see these changes if I switch back to LocalMaster?

Changes made in the work-tree and/or the index are just in the work-tree and/or in the index.

When you ask Git to check out some other commit, Git compares the current (HEAD) commit to that other commit. If some file differs, Git has to replace the index and work-tree versions. If not, Git can leave it alone.

Let's go through a short example, where you ask Git (via git checkout) to move from commit badbeef... to commit ac0ffee.... There are three files in each: README, a.txt, and b.txt. The version of README in both commits is the same. The versions of a.txt and b.txt are not.

To do this move, then, Git must swap out your existing a.txt and b.txt, in both the index and the work-tree. Is that safe? Well, if you made no changes to a.txt, that part is safe. If you made changes to b.txt, it's not, and you get an error.

But README is the same in both commits. Git doesn't have to swap it out. So any changes you made to README, in index and/or work-tree, can be left in place. As long as a.txt and b.txt are untouched, Git can replace them; since README doesn't need replacing, Git can leave it untouched.

This lets you carry an uncommitted README change across those two commits, but not any uncommitted a.txt or b.txt. Those two aren't safely saved away unless they match what's in the commit-before-switching.
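The same behavior can be reproduced in a scratch repository. This is only a sketch: the temporary directory, the file contents, and the user identity are made up, and the branch names are borrowed from the question.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/LocalMaster   # name the initial branch

echo "readme" > README; echo "a1" > a.txt
git add README a.txt
git commit -q -m "base"

git checkout -q -b BugFix
echo "a2" > a.txt
git commit -q -am "change a.txt on BugFix"   # README is still identical in both commits

# Uncommitted README edit: README matches in both commits, so it is carried along.
echo "edit" >> README
git checkout -q LocalMaster
carried=$(tail -n 1 README)

# Uncommitted a.txt edit: a.txt differs between the commits, so Git refuses.
echo "local" >> a.txt
if git checkout -q BugFix 2>/dev/null; then refused=no; else refused=yes; fi
branch=$(git symbolic-ref --short HEAD)
echo "carried=$carried refused=$refused branch=$branch"
# prints: carried=edit refused=yes branch=LocalMaster
```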

Say I'm making changes in BugFix branch and I commit them to that branch only. How can I view what changes exist in this branch?

When you make a commit, that's a complete and total snapshot (of whatever is in the index). The commit itself causes the current branch name to change, so that the name resolves to the new commit. There aren't really any "changes" in this at all: it's a snapshot. To find out "what changed" you must pick some other snapshot and get Git to compare them.

It's easy to get Git to compare any commit against its immediate parent commit: that's what git log -p or git show will show, as "the changes". To show those as changes, though, Git has to re-compute the difference, from the two snapshots, every time.

If you want to see every change since some particular earlier commit, you need git diff. This takes two snapshots and compares them. It's just like git log -p except that with git log you're always comparing parent and child, and with git diff you're comparing two commits you choose.

Git doesn't keep track of "where a branch started" so to find "changes on a branch", you must define your own starting point.
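As a rough sketch, if you decide that the starting point is "wherever BugFix parted ways with LocalMaster", the two-dot and three-dot forms do the job. The scratch repository, file name, and identity below are assumptions for illustration only.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/LocalMaster   # name the initial branch

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b BugFix
echo "fix1" >> file.txt; git commit -q -am "fix 1"
echo "fix2" >> file.txt; git commit -q -am "fix 2"

# Commits reachable from BugFix but not from LocalMaster, each with its patch
git --no-pager log -p LocalMaster..BugFix
count=$(git rev-list --count LocalMaster..BugFix)

# One combined diff from the common ancestor to the tip of BugFix
git --no-pager diff LocalMaster...BugFix
added=$(git diff --numstat LocalMaster...BugFix | cut -f1)
echo "commits=$count lines_added=$added"
# prints: commits=2 lines_added=2
```

The three-dot form of git diff compares against the merge base, so later commits on LocalMaster do not show up as if BugFix had undone them.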

Assume that I made changes in step 2. Now something else came up and I just want to make a quick fix in the LocalMaster branch. I can do that and push it back to the network. How can I synch my BugFix branch with that same change?

Alas, now we get into the commit graph. This is actually where some GUIs shine (and some are really awful...).

We noted above that when you make a new commit, this advances the current branch name to point to the new commit. Let's draw a repository with three commits in it:

A <-B <-C   <--master

Instead of big ugly hash IDs, these commits have easy one-uppercase-letter names. Commit C is the newest, so the name, master, remembers its hash ID. Commit C remembers the hash ID of its parent B, and B remembers the hash ID of the first commit, A.

Since A was the first commit, it has no parent at all. It is a root commit (which is just a fancy way of saying "a commit with no parent").

The internal arrows within commits are fixed for all time, as is everything else about a commit. (There are good technical reasons for this: basically, the hash ID is itself a cryptographic checksum of the contents of the commit, so if you change anything at all, you get a new, different commit.) So the internal arrows are not very interesting—we just need to remember that they always point backwards. So:

A--B--C   <-- master

The branch names, however, change over time. This one currently stores the hash ID for C. If we make a new commit, we get:

A--B--C--D   <-- master

and now master stores the hash ID for D. (To find C, Git starts at D. D stores the ID for C—"points back to" C—and that gets us to C, and then B, and so on.)

When you make a new branch, Git copies some ID into the new branch name. By default, we start with the current commit (which is now D):

A--B--C--D   <-- master, newbranch (HEAD)

Now that there are two names, we need to know which one is the current one. Git stores this in the special name HEAD (actually a file, .git/HEAD): HEAD literally contains the name of the branch.
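You can look at this directly. The sketch below assumes a scratch repository with made-up file and identity values, and assumes the default on-disk ref storage, where HEAD is a plain text file.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/master   # name the initial branch

echo "x" > f.txt
git add f.txt
git commit -q -m "D"

git checkout -q -b newbranch   # copy the current commit's ID into a new name

cat .git/HEAD    # with the default "files" ref storage: ref: refs/heads/newbranch
current=$(git symbolic-ref --short HEAD)   # the same answer, via Git itself
master_id=$(git rev-parse master)
new_id=$(git rev-parse newbranch)
echo "current=$current"
# prints: current=newbranch
```

Both names hold the same hash ID at this point; only HEAD tells Git which one to move on the next commit.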

Now if we make a new commit, Git makes the new commit as usual, and updates the current branch, which is newbranch:

A--B--C--D   <-- master
          \
           E   <-- newbranch (HEAD)

and we have a new commit on a new branch that, visually, looks branch-y. Except that it's just a straight line with a kink in it, really. To make it properly branch-y, we must check out master (so that HEAD says master) and make another new commit:

A--B--C--D--F   <-- master (HEAD)
          \
           E   <-- newbranch

and now we really do have a branch.

The key here is that the actual branching is a property not of the names (master vs newbranch) but of the commits and their embedded graph links. The names just let us get started into the graph, after which we do all this backwards-link-following.

So, to properly answer question 3, we have to see what the graph looks like. It might look like this:

...--G--H--I   <-- LocalMaster (HEAD)
            \
             J   <-- BugFix

You can now make a new commit on LocalMaster:

...--G--H--I--K   <-- LocalMaster (HEAD)
            \
             J   <-- BugFix

Now you'd like BugFix to start from commit K. You can't change J at all, but you can copy it, to a new temporary branch. After you copy to a new similar commit, you get this:

                J'   <-- tmp (HEAD)
               /
...--G--H--I--K   <-- LocalMaster
            \
             J   <-- BugFix

(using the name J' to indicate that it's a copy of J). You can now start ignoring your original J by forcing your Git to re-locate the name BugFix to point to J':

                J'   <-- BugFix (HEAD), tmp
               /
...--G--H--I--K   <-- LocalMaster
            \
             J   [abandoned]

and now you don't need the temporary name, so you can just delete it.

The command that does all of this, all at once, for you is git rebase.
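Here is a small sketch of that in a scratch repository. The file names, commit messages, and identity are invented; the one-letter commit names from the drawings are used as messages so the result is easy to match up.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/LocalMaster   # name the initial branch

echo "base" > base.txt;  git add base.txt;  git commit -q -m "I"
git checkout -q -b BugFix
echo "fix" > fix.txt;    git add fix.txt;   git commit -q -m "J"
old_j=$(git rev-parse BugFix)

git checkout -q LocalMaster
echo "quick" > quick.txt; git add quick.txt; git commit -q -m "K"
k=$(git rev-parse LocalMaster)

git checkout -q BugFix
git rebase -q LocalMaster      # copy J to J' on top of K, then move BugFix to J'

new_j=$(git rev-parse BugFix)
parent=$(git rev-parse BugFix^)
git --no-pager log --oneline --graph --all
```

Afterwards BugFix names a different hash ID than before (J' rather than J), its parent is K, and both quick.txt and fix.txt are present in the work-tree.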

You do not have to rebase, though, and there are cases where it's unwise. In particular, suppose you made commit J available to others, pushing it back somewhere. Those other people now have their copies of commit J. You propose to replace it with this new J'. You must get everyone else to do this same replacement! Otherwise your old J can come back: they may think that the original J is something important, not a duplicate of J', and re-introduce it later.

When rebasing is not an option, you can instead use git merge. What merge does is complex, but in the end, it makes a new merge commit, which is a commit that points back not to one parent, but to two. You might start, as you did before, with:

...--G--H--I--K   <-- LocalMaster (HEAD)
            \
             J   <-- BugFix

You then check out BugFix and run git merge LocalMaster. This figures out how to combine the changes since the merge base (commit I, where the branches come together) with those at the two branch tips (commits J and K, in some order—the order doesn't matter for the combining, but does matter later). If the combining succeeds, Git makes the new merge commit. Its first parent will be J, as that's the commit that is HEAD during the git merge. The result looks, graph-wise, like this:

...--G--H--I--K   <-- LocalMaster
            \  \
             J--M   <-- BugFix (HEAD)
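A sketch of the merge variant, again in a scratch repository with invented file names, messages, and identity:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/LocalMaster   # name the initial branch

echo "base" > base.txt;  git add base.txt;  git commit -q -m "I"
git checkout -q -b BugFix
echo "fix" > fix.txt;    git add fix.txt;   git commit -q -m "J"
j=$(git rev-parse BugFix)

git checkout -q LocalMaster
echo "quick" > quick.txt; git add quick.txt; git commit -q -m "K"
k=$(git rev-parse LocalMaster)

git checkout -q BugFix
git merge -q -m "M" LocalMaster   # make merge commit M on BugFix

first=$(git rev-parse HEAD^1)     # first parent: the commit that was HEAD (J)
second=$(git rev-parse HEAD^2)    # second parent: the commit merged in (K)
git --no-pager log --oneline --graph --all
```

Unlike the rebase, J keeps its original hash ID here; only the new commit M is added, and LocalMaster still points to K.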

Say I made a change in Step 2 that I don't want to merge to the LocalMaster. How can I back out that change?

You're suggesting here that you made a change while on BugFix and committed it. There is nothing to back out yet:

...--G--H--I   <-- LocalMaster
            \
             J   <-- BugFix

It doesn't matter, here, which branch is current. The name LocalMaster records commit I, so that commits up through and including I are on LocalMaster. The name BugFix records commit J, so that commits through G--H--I--J are on BugFix. Note that many (all but one, in this case) commits are on both branches—this is another way that Git is peculiar.
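You can count this for yourself. The sketch below builds just G, H, I and J as empty commits in a scratch repository (the identity and the use of empty commits are assumptions to keep it short), then asks Git which commits each name reaches.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git symbolic-ref HEAD refs/heads/LocalMaster   # name the initial branch

for name in G H I; do
    git commit -q --allow-empty -m "$name"     # empty commits are enough here
done
git checkout -q -b BugFix
git commit -q --allow-empty -m "J"

on_master=$(git rev-list --count LocalMaster)            # G, H, I
on_bugfix=$(git rev-list --count BugFix)                 # G, H, I, J
only_bugfix=$(git rev-list --count LocalMaster..BugFix)  # just J
echo "LocalMaster=$on_master BugFix=$on_bugfix only_on_BugFix=$only_bugfix"
# prints: LocalMaster=3 BugFix=4 only_on_BugFix=1
git branch --contains LocalMaster   # lists both branch names: commit I is on both
```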
