I'm following a Git tutorial, and it recently showed the output of cat .git/config for tracking remote branches, as seen below. I understand that branch "master" is the master branch and that origin refers to the remote-tracking branch on the local computer, but can someone explain what the fetch, merge, and remote options are? (I understand the rest.)
$ cat .git/config
[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
    symlinks = false
    ignorecase = true
[remote "origin"]
    url = https://github.com/rich44/explore_california.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
Technically, each of these is just a section with variables, so it's not really true that, e.g., [branch "master"] is the master branch; it's just a pair of settings that can be spelled out as:
    branch.master.remote origin
    branch.master.merge refs/heads/master
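To make the "section plus variable" idea concrete, here is a minimal Python sketch that flattens INI-style text into dotted names like the ones just shown. The flatten_config function, the HEADER pattern, and the SAMPLE text are illustrative assumptions; this is a simplified model, not Git's actual parser (which also handles quoting, escapes, and includes).

```python
import re

# Matches "[section]" or '[section "subsection"]' headers (simplified).
HEADER = re.compile(r'^\[([A-Za-z0-9.-]+)(?:\s+"(.*)")?\]$')

def flatten_config(text):
    """Return (dotted_name, value) pairs for simplified Git-style config text."""
    pairs = []
    prefix = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        header = HEADER.match(line)
        if header:
            section, subsection = header.groups()
            prefix = section.lower()
            if subsection is not None:
                prefix += "." + subsection  # the subsection keeps its case
            continue
        if prefix is None:
            continue  # ignore variables that appear before any section header
        name, _, value = line.partition("=")
        pairs.append((prefix + "." + name.strip().lower(), value.strip()))
    return pairs

# Hypothetical sample input mirroring the [branch "master"] section.
SAMPLE = '''
[branch "master"]
    remote = origin
    merge = refs/heads/master
'''

for name, value in flatten_config(SAMPLE):
    print(name, value)
```

Running this prints the two dotted settings, one per line, which is all that a [branch "master"] section really amounts to.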
Your question effectively works out to some number of parts, which I'll try to take in a sensible order, although Git being what it is, sometimes no order is sensible. :-)
Git likes[1] to talk about "sections" in the configuration. Specifically, the git config command has --remove-section. A section is basically just the part inside square brackets, which is a syntax stolen originally from the INI file format.
Beyond this, however, Git internally doesn't actually care about sections. Each program simply queries with either a fixed string, such as
branch.master.remote, or using a regular expression or similar, such as
core\..*, which matches everything in the
[core] section. Here the backslash is required to protect the first dot
. character; the second
. character means "match anything" and the asterisk
*—called a Kleene star in informatics theory—means "repeat it zero or more times". Hence this matches any string starting with
core. and continuing on for zero or more characters. (Most of the C code inside Git that matches items in sections is considerably cruder than this; only
git config itself and various scripts that run
git config really use the full power of regular expressions.)
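The effect of escaping the first dot can be checked with a small Python sketch. Note the assumptions: Python's re module stands in for whatever regex library git config uses, re.match is used because it anchors at the start of the string (modeling "starting with"), and the key corex.oops is a made-up name included only to show the difference.

```python
import re

keys = [
    "core.bare",
    "core.ignorecase",
    "corex.oops",           # hypothetical key, not a real Git setting
    "remote.origin.url",
    "branch.master.merge",
]

escaped = re.compile(r"core\..*")    # backslash: the first dot is a literal "."
unescaped = re.compile(r"core..*")   # no backslash: the first dot matches any character

# re.match only succeeds when the pattern matches at the start of the key.
in_core = [k for k in keys if escaped.match(k)]
too_loose = [k for k in keys if unescaped.match(k)]

print(in_core)
print(too_loose)
```

The escaped pattern selects only the two core.* keys, while the unescaped one also lets corex.oops through, because its first dot happily matches the x.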
The ability to scan for matches allows Git to, for instance, find all remotes, which are simply all the names in each
[remote "..."] section. (The double quotes here evolved because INI-file syntax forbids certain characters that are allowed in remote names and branch names and so on. Fortunately, remote and branch names cannot contain double quotes: if they could, we would have to wonder how these would be encoded.)
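As a rough illustration of "finding all remotes by scanning", the Python sketch below pulls the remote names out of a list of dotted keys. The remote_names helper and the sample keys are assumptions made for this example; Git's own code does this differently, in C.

```python
import re

def remote_names(keys):
    """Collect the distinct <name> parts of remote.<name>.<variable> keys."""
    # The name may itself contain dots; the variable (last component) may not.
    pattern = re.compile(r"remote\.(.+)\.[^.]+$")
    names = []
    for key in keys:
        m = pattern.match(key)
        if m and m.group(1) not in names:
            names.append(m.group(1))
    return names

# Hypothetical flattened configuration keys.
sample_keys = [
    "core.bare",
    "remote.origin.url",
    "remote.origin.fetch",
    "remote.bob.url",
    "branch.master.remote",
]

print(remote_names(sample_keys))
```

Only keys that start with remote. contribute a name, so branch.master.remote is ignored even though it contains the word "remote".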
[1] I realize I'm anthropomorphizing Git here. It's a useful metaphor. But remember, don't anthropomorphize computers: they hate that!
Because INI syntax is so flexible, you can put anything you like into a [remote "..."] section. Things Git does not know or care about, Git simply ignores. For instance, you can edit your config to contain:
[remote "origin"]
    abracadabra = magic word
    hello = kitty
and these two settings will be ignored. So it's more interesting to find out what items Git will actually pay any attention to. (Any list we make is necessarily incomplete, because Git grows new "items to pay attention to" over time: at one point, Git looked for
url but not for
pushurl. At a previous $job, many years ago, I added code to our Git scripts to check for a
pushurl. Then a new version of Git, 1.6.4, came out, that used
pushurl in precisely the same way I had used it. [Clearly we had the same idea.] If we choose our new items well, we'll probably end up agreeing with the Git folks as to their meanings, as happened here, but it's always a bit of a risk, adding new things.)
Nonetheless, here's a partial list, mostly copied from the
git config documentation:
url: The default place to fetch from, and usually, to push to.
pushurl: If set, the default place to push to.
mirror: If set, make pushes to this remote use --mirror.
proxy: Provide a proxy setting for libcurl.
fetch: Provide a default refspec for git fetch (may be repeated).
push: Provide a default refspec for git push (may be repeated).
prune: If set, make fetch from this remote use --prune.
The one you specifically asked about—the
fetch line—supplies refspecs for
git fetch. See below for more about refspecs.
Again, for [branch "..."] sections there is quite a long list of things you can set that Git does actually pay attention to. These are mostly, though perhaps never completely, documented in that same git config documentation.
remote: The default remote to use when fetching and pushing.
merge: The name of the branch as seen on the remote Git, to use when fetching and pushing. This gets a bit complicated in complicated cases: see the description of refspecs below.
rebase: A setting (default false, other options include preserve) for how git pull should run its second half (remember that git pull is essentially shorthand for "first run git fetch, then run another, second, Git command").
description: A short descriptive string inserted into format-patch cover letters and pull requests.
Git originally had branches and tags. To keep them separate, Git stored branches in
refs/heads/ and tags in
refs/tags/. These were—and in fact, still are—just directories within
.git, although today there are also "packed references" stored in
.git/packed-refs, which saves time and disk space if you have thousands of rarely-updated branches and never-updated tags.
What this means, though, is that when you see a branch name like
master, Git actually sees
refs/heads/master. This is not only where Git stores the hash ID for the branch, it's also how Git knows that it is a branch in the first place. You type in
master; Git searches in
.git/packed-refs, and comes up with
refs/heads/master; and the
refs/heads/ part tells Git: aha, this is a branch!
When Git acquired remote-tracking branches, they were easy to add, because of this little bit of planning ahead. A remote-tracking branch is simply a name stored in
refs/remotes/. To this prefix, Git adds the name of the remote itself, so that
origin/master goes in
refs/remotes/origin/master. This full name is what tells Git that the name is in fact a remote-tracking branch in the first place.
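The "the prefix tells Git what kind of name this is" rule can be sketched in a few lines of Python. The classify_ref function and its labels are illustrative assumptions, covering only the three namespaces discussed here; real Git knows about more of them (for instance refs/notes/).

```python
def classify_ref(full_name):
    """Return (kind, short_name) based on the namespace prefix of a full ref name."""
    prefixes = [
        ("refs/heads/", "branch"),
        ("refs/tags/", "tag"),
        ("refs/remotes/", "remote-tracking branch"),
    ]
    for prefix, kind in prefixes:
        if full_name.startswith(prefix):
            return kind, full_name[len(prefix):]
    return "other", full_name  # anything outside the three namespaces above

for ref in ("refs/heads/master", "refs/tags/v1.0", "refs/remotes/origin/master"):
    print(ref, "->", classify_ref(ref))
```

Stripping the prefix also yields the short name people normally type, such as master or origin/master.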
To really understand refspecs, it helps to know a bit of Git history. Back in the Dim Time, before Git had remotes and remote-tracking branches, people had a collection of various kludges to deal with pushing and pulling changes from other Git repositories. There was a
git fetch then, but it was a difficult-to-use "plumbing" program, meant mainly for scripts to use. It dumped its results into a file called
FETCH_HEAD. The front end "nice" command was
git pull. This is why Git has
push and pull as the obvious (but wrong!) opposites, and why people are usually introduced to transferring data with
pull instead of using
fetch, which is actually the better way to go at first. The nice version of
fetch did not exist yet. Today's
git fetch still dumps its results into
FETCH_HEAD, but also behaves better.
Because remote-tracking branches did not exist, refspecs did not really need to exist then either. If you were fetching from Bob's repository, you had a nicer front end script that ran
git fetch. The fetch reached into Bob's computer somehow (via
https or whatever, just as is still done today) and extracted his branches and tags and dropped everything into your
FETCH_HEAD file. Your front-end script then extracted whatever looked interesting and let you merge, or rebase, or whatever you intended to do.
Note that during this process, you don't have to care one bit how Bob names his branches. Your front end script just looks into
FETCH_HEAD. That file is completely separate from your branches, and you—or your front end script—can throw away Bob's branch names as soon as you are done looking.
Your nice front-end pull script, of course, does need to know Bob's name for your branch. Let's say you call your branch
paris, while Bob calls his
asteroid_433. You always want to merge his into yours, so you just configure your branch.paris.merge to refs/heads/asteroid_433.
This whole process was pretty messy. Someone came along and invented remote-tracking branches, which really is quite a brilliant idea. Instead of having to remember that Bob calls this
asteroid_433, why not just get everything from Bob? For each name you get, if it's a branch—if it starts with
refs/heads/—just drop it into your repository under the name
refs/remotes/Bob/whatever. Now you can easily see all of Bob's branches any time. You'll still need to remember the mapping ("my
paris = Bob's
asteroid_433") but most of the time you two will probably use the same name.
Imagine you're the guy inventing this wonderful new feature. Of course, the
pull script already exists. You can't change
branch.paris.merge: it's still going to say
refs/heads/asteroid_433, which is the name on Bob's computer. And, you're not sure if this is how you want to do it. Maybe you'd like to have
git fetch grab Bob's
asteroid_433 and rename it to
paris, so that you get
refs/remotes/Bob/paris as the remote-tracking branch.
Enter the refspec.[2] The refspec is, in this case, simply a pair of names separated by a colon, and optionally prefixed with a plus sign. The name on the left is the "source" and the name on the right is the "destination". The source is Bob's names for his branches, and the destination is your remote-tracking branches for your remote named bob.
To make this work nicely, you put in pattern matching. For whatever reason, you don't use regular expressions, but rather use shell style glob expressions. (In versions of Git before 2.9 or so these are even further limited. They work well enough though.) You still have the name on the left,
refs/heads/*, and the name on the right,
refs/remotes/bob/*, but now the left side
* means "match anything", and the right side
* means "replace this with what you matched on the left".
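The "match on the left, substitute on the right" rule is easy to model. Below is a minimal Python sketch, assuming at most one * on each side; parse_refspec and map_ref are hypothetical helper names for this example, not functions from Git.

```python
def parse_refspec(spec):
    """Split '+src:dst' into (force, src, dst)."""
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    src, _, dst = spec.partition(":")
    return force, src, dst

def map_ref(spec, ref):
    """Return the destination name for `ref`, or None if the source side does not match."""
    _, src, dst = parse_refspec(spec)
    if "*" not in src:
        return dst if ref == src else None  # exact-name refspec
    prefix, _, suffix = src.partition("*")
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    matched = ref[len(prefix):len(ref) - len(suffix)]  # what the left-side * matched
    return dst.replace("*", matched, 1)                # plug it into the right-side *

spec = "+refs/heads/*:refs/remotes/bob/*"
print(map_ref(spec, "refs/heads/asteroid_433"))
print(map_ref(spec, "refs/tags/v1.0"))
```

The branch name maps to refs/remotes/bob/asteroid_433, while the tag maps to None because it does not start with refs/heads/.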
This produces your remote-tracking branches, which
git fetch now updates. To store your new refspecs, you add
fetch configuration entries to the
[remote "bob"] section.
To allow for multiple renamings, you make sure Git reads all the
fetch = lines, so that someone can write:
[remote "bob"]
    fetch = +refs/heads/asteroid_433:refs/remotes/bob/paris
    fetch = +refs/heads/master:refs/remotes/bob/master
and so on. But in practice, most remotes ended up with just one fetch line.
(Allowing multiple lines is good forethought, as it turns out that today, we sometimes want to add refspecs beginning with
+refs/notes/... so that we can bring over Git notes. You can't know this yet, but obviously you're pretty smart, coming up with remote-tracking branches in the first place. :-) )
Of course, the old
branch.paris.merge syntax has to stick around, because you can't change existing Git users' configurations. So now Git must, when it goes to use this
merge value, map the value through the same
fetch refspecs in order to figure out the correct remote-tracking branch name. (The old pull script didn't bother—it just got the value from
FETCH_HEAD directly. That script has somewhat recently been rewritten as a C program, and it is no longer obvious what it does. The
rebase commands, when run standalone, do in fact do this mapping, as they must.)
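Here is one way to picture that mapping step, as a self-contained Python sketch. The config dictionary layout and the helper names map_through and upstream_of are assumptions made for illustration, and taking the first matching refspec is a simplification; this is not how Git's C code is organized.

```python
def map_through(refspec, ref):
    """Map `ref` through one 'src:dst' refspec (at most one * per side), or return None."""
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, _, dst = spec.partition(":")
    if "*" not in src:
        return dst if ref == src else None
    prefix, _, suffix = src.partition("*")
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    return dst.replace("*", ref[len(prefix):len(ref) - len(suffix)], 1)

def upstream_of(branch, config):
    """Find the remote-tracking name for a branch from its remote and merge settings."""
    remote = config.get("branch." + branch + ".remote")
    merge = config.get("branch." + branch + ".merge")
    if remote is None or merge is None:
        return None
    # Try each fetch refspec of that remote in order; the first match wins here.
    for refspec in config.get("remote." + remote + ".fetch", []):
        mapped = map_through(refspec, merge)
        if mapped is not None:
            return mapped
    return None

# Hypothetical configuration: your branch paris tracks Bob's asteroid_433.
config = {
    "branch.paris.remote": "bob",
    "branch.paris.merge": "refs/heads/asteroid_433",
    "remote.bob.fetch": ["+refs/heads/*:refs/remotes/bob/*"],
}

print(upstream_of("paris", config))
```

With the usual wildcard refspec, the merge value refs/heads/asteroid_433 (Bob's name) comes out as refs/remotes/bob/asteroid_433 (your remote-tracking name).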
I should mention the leading
+ here as well. This is simply the force flag for the given refspec. It's the equivalent of
git push --force: it means "this reference should be updated no matter what", as compared to the more usual rule for branch name updates, which are allowed if the update adds new commits, but rejected if the update would "lose" existing commits (e.g., if you're picking up an upstream
git reset of some sort). Normally every
fetch = refspec for each remote has the leading
+, since you always want all your remote-tracking branches updated to whatever commit their branch says is the correct one right now.
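The difference the + makes can be modeled with a toy commit graph in Python. Everything here is an assumption for illustration: the single-letter commit names, the parents dictionary, and the helpers is_ancestor and update_allowed. The rule it encodes is the one described above: without force, an update is accepted only if the old commit is still reachable from the new one.

```python
def is_ancestor(parents, old, new):
    """True if commit `old` is reachable from `new` by following parent links."""
    stack, seen = [new], set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False

def update_allowed(parents, old, new, force):
    """Decide whether a ref may move from `old` to `new`."""
    if force or old is None:
        return True  # forced updates and brand-new refs are always accepted
    return is_ancestor(parents, old, new)  # otherwise no commits may be "lost"

# Toy history: A <- B <- C is the normal line; D is a rewrite built directly on A.
parents = {"B": ["A"], "C": ["B"], "D": ["A"]}

print(update_allowed(parents, "B", "C", force=False))  # adds commits on top of B
print(update_allowed(parents, "B", "D", force=False))  # would drop B
print(update_allowed(parents, "B", "D", force=True))   # the leading + case
```

The first and third calls are accepted and the second is rejected, which is why a fetch refspec without the + would leave a remote-tracking branch stuck after an upstream reset.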
[2] Actually, it already existed, for
git push purposes, because
git push always needed it. But it got refined somewhat.