KcFnMi - 23 days ago
Git Question

track the same files in multiple repositories

Let's say I have repo1 being tracked (it contains .git).

Then, inside of it, I download project1 from GitHub. This has its own .git folder (which I might ignore?).

Proceeding, I'll add project1 to repo1.

At this point, I'll say files in project1 are being tracked by multiple repositories.

I did some tests here without any issues; apparently this situation can be carried on. I can commit in both repositories, and everything still seems to be in order.

Did I forget anything? Is this somehow dangerous?

Answer

It's not a problem for Git. It may let you confuse yourself, and/or let you do things you did not intend.

The key is to keep track of three things: the repository itself, the work-tree (usually one per repository), and the index (for that repository or work-tree).[1]

Git will ignore all files within any .git sub-directories.[2] You have some top-level repo1 directory (and repository) that contains another directory project1 (that is its own repository), hence we can be sure that repo1/project1/.git/HEAD exists, and repo1/project1/.git/refs/heads/master probably exists too; but Git automatically ignores them. However, as you saw, Git does not automatically ignore other files. That is, from within repo1, files within project1 (such as repo1/project1/README) are seen as a valid path name project1/README, which can be either tracked (in the index for repo1) or untracked (not in that index).

If these files are showing up as untracked, and you try to use git status to view them, you will see only the directory-as-a-whole (i.e., project1/), unless you ask for -uall (or --untracked-files=all). Usually most people hide the entire directory away with a .gitignore so that its files neither show up as untracked, nor accidentally get added.
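As a rough illustration of hiding the inner directory, here is a minimal sketch. It builds a throwaway repo1/project1 pair in a temporary directory (the scratch location and the README file are assumptions made for the demo, not part of your setup) and compares what git status reports before and after adding the ignore rule.

```shell
# Sketch only: repo1 and project1 mirror the names in the question.
set -e
demo=$(mktemp -d)              # scratch area, so nothing real is touched
cd "$demo"

git init -q repo1              # outer repository
git init -q repo1/project1     # stands in for the project downloaded from GitHub
echo hello > repo1/project1/README

cd repo1
before=$(git status --porcelain)   # lists project1/ as untracked
echo 'project1/' >> .gitignore     # ignore the whole inner directory
after=$(git status --porcelain)    # project1/ is gone; only .gitignore is untracked

printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```

Whether you then commit the .gitignore to repo1 is up to you; the inner repository is unaffected either way.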

Meanwhile, any time you do an operation while sitting at some level of this nesting hierarchy, the Git "at that level" will catch it. For instance, if you change your current working directory into repo1 and run git status, you will examine the status with respect to repo1, but if you then change it to project1 and run git status, you will examine the status with respect to project1.

If project1 has a subdirectory of its own (repo1/project1/sub/) that does not have a .git directory, operations done in that sub-directory are "in" project1.

In other words, unless you give it extra instructions, Git starts from where you are now and checks for .git. If there is none here, it climbs up one directory and tries again. It repeats until it runs out of possible places to look (there's some special case code to avoid climbing out of a file system, so "possible places" may not continue up to /; this is OS-dependent). Once it finds the "top level" with a .git, it stops climbing. To see where it stopped, run:

$ git rev-parse --show-toplevel

Wherever it stopped, that's where the work-tree is.[3] That's usually (but not always) where the repository itself is as well:

$ git rev-parse --git-dir

(which may show a relative or absolute path).
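To watch the climbing behaviour in action, here is a small sketch that recreates the nesting in a temporary directory and asks Git where it stopped from each level. The scratch directory and the variable names are assumptions for the demo; only the two rev-parse commands come from the text above.

```shell
# Sketch only: repo1, project1 and sub mirror the names used above.
set -e
demo=$(mktemp -d)                      # scratch directory
git init -q "$demo/repo1"              # outer repository
git init -q "$demo/repo1/project1"     # inner repository with its own .git
mkdir "$demo/repo1/project1/sub"       # plain sub-directory, no .git of its own

# Ask Git, from each level, where its upward search for .git stopped.
top_outer=$(cd "$demo/repo1" && git rev-parse --show-toplevel)
top_inner=$(cd "$demo/repo1/project1" && git rev-parse --show-toplevel)
top_sub=$(cd "$demo/repo1/project1/sub" && git rev-parse --show-toplevel)
gitdir_sub=$(cd "$demo/repo1/project1/sub" && git rev-parse --git-dir)

echo "from repo1:              $top_outer"
echo "from repo1/project1:     $top_inner"
echo "from repo1/project1/sub: $top_sub (git-dir: $gitdir_sub)"
```

The second and third lines should name the same top level (project1), since sub has no .git and Git climbs one directory to find one.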

With this kind of nesting, you will need to be (painfully :-) ) aware of which repository you're working with, because each repository (or, more precisely, some work-tree associated with the repository, plus the repository and index as necessary) will get touched by various Git commands. But this "touching" won't inform any inner Git repositories in any way. For instance, if you run git checkout otherbranch, Git will modify files in your current work-tree (and its index) by switching branches, and then update HEAD in your current repository to point to otherbranch. If this work-tree overlaps the work-tree of some other repository, and you move yourself into that other repository, suddenly all those (changed) files no longer match that repository's HEAD commit.

When work-trees overlap like this, people make mistakes. Git won't care; it's just dealing with a repository, a work-tree, and an index.
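One way to reduce such mistakes is to ask Git which repository and branch a command would affect before running it. The helper below is only a sketch: the name whichrepo is made up for this example, and the demo repositories live in a temporary directory.

```shell
# Hypothetical helper (not a Git command): report which repository and
# branch a Git command run from the current directory would touch.
whichrepo() {
    top=$(git rev-parse --show-toplevel 2>/dev/null) || {
        echo "not inside a Git work-tree"
        return 1
    }
    # symbolic-ref exits non-zero when HEAD is detached; -q keeps it quiet.
    branch=$(git symbolic-ref --short -q HEAD || echo "detached HEAD")
    echo "repository: $top (on $branch)"
}

# Demo on a throwaway nested layout.
demo=$(mktemp -d)
git init -q "$demo/repo1"
git init -q "$demo/repo1/project1"
outer_report=$(cd "$demo/repo1" && whichrepo)
inner_report=$(cd "$demo/repo1/project1" && whichrepo)
echo "$outer_report"
echo "$inner_report"
```

Something like this could also be wired into a shell prompt, so the answer is always visible rather than something you have to remember to check.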


[1] In older versions of Git there is only one work-tree for a repository. If you use the new git worktree add, you can have more than one work-tree, and each will have its own index. In the special case of having no work-tree (footnote 3), there's still one index.
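If you want to see the per-work-tree index for yourself, the following sketch may help. It assumes a Git recent enough to have git worktree and git rev-parse --git-path; the repository name, the branch name other, and the placeholder identity passed with -c are all made up for the demo.

```shell
# Sketch only: shows that a linked work-tree gets its own index file.
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

# git worktree add needs a commit to check out, so make an empty one.
# The identity below is a placeholder for the demo.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Create a second work-tree next to the first, on a new branch.
git worktree add -b other ../repo-other >/dev/null 2>&1

main_index=$(git rev-parse --git-path index)
other_index=$(cd ../repo-other && git rev-parse --git-path index)
echo "main work-tree index:   $main_index"
echo "linked work-tree index: $other_index"
```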

[2] There used to be some security-related issues on systems that folded case, because you could have repositories that had files named .GiT/hooks/pre-commit or somedir/.gIT/hooks/pre-commit for instance, which would overwrite your top or sub-level repositories' hooks. Modern Git automatically ignores .Git, .giT, .GIt, and so on as well.

[3] Assuming there is a work-tree at all, that is. If the repository is "bare" (core.bare is set, and not overridden), then there is no work-tree and all these questions basically just vanish.
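To check whether a given repository is bare, git rev-parse can tell you directly. This is a minimal sketch using two throwaway repositories; the directory names bare.git and normal are assumptions for the demo.

```shell
# Sketch only: compare a bare and a non-bare repository.
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/bare.git"   # repository with no work-tree
git init -q "$demo/normal"            # ordinary repository with a work-tree

bare_flag=$(cd "$demo/bare.git" && git rev-parse --is-bare-repository)
normal_flag=$(cd "$demo/normal" && git rev-parse --is-bare-repository)
in_worktree=$(cd "$demo/bare.git" && git rev-parse --is-inside-work-tree)

echo "bare.git is bare:           $bare_flag"      # true
echo "normal is bare:             $normal_flag"    # false
echo "inside a work-tree (bare):  $in_worktree"    # false
```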
