Spacemoose - 1 year ago
Git Question

How can I uniquely identify a git repository

I would like to create a tool that checks whether I already have a local clone of a remote repository before cloning it. To do this, I need a way of testing whether repository B is the same as repository A, by which I guess I mean they have mergeable histories. B might be named differently from A and might have additional branches (the usual use cases).

Is there a way to do this? I have a tentative idea how to do it, but I thought perhaps someone here has a definitive answer.

Tentative idea

Get a list of branches and search for common branches (by hash). Then, for the common branches, check that the initial commits are the same (by hash). At that point I would say 'good enough'. I figure I'm okay unless someone has been messing with history, a case I'm willing to neglect. To do this, though, I need a way of getting the branch and commit information from the remote repository without doing a clone. I can solve this using ssh and bash, but a git-only solution would be preferable.

Feedback on the half-baked idea is also welcome.
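
For reference, `git ls-remote` lists a remote's refs and their commit hashes without cloning, and `git cat-file -e` tests whether an object exists in a local repository. A minimal sketch of the branch-tip check described above, assuming a POSIX shell; the function name `shares_history` and its arguments are hypothetical, not part of git:

```shell
# Hypothetical helper: succeeds if the local repository already contains
# the tip commit of at least one branch of the remote.
# Usage: shares_history <local-repo-path> <remote-url>
shares_history() {
    local_repo=$1
    remote_url=$2
    # ls-remote prints "<sha><TAB><refname>" per ref; --heads limits it to branches.
    git ls-remote --heads "$remote_url" | while read -r sha ref; do
        # cat-file -e exits 0 only if the object exists in the local object store.
        if git -C "$local_repo" cat-file -e "${sha}^{commit}" 2>/dev/null; then
            echo "$ref"
        fi
    done | grep -q .   # succeed if at least one remote tip was found locally
}
```

A hit means the two repositories share history. A miss is inconclusive: if the remote has moved ahead on every branch since the local clone was last updated, none of its current tips exist locally even though the histories are related.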

Why this is not a duplicate of Git repository unique id

The referenced question is looking for a unique repository id, or a way of creating one. No such beast exists, and even if it did, it is questionable whether it would be relevant here, since I want to determine whether two repositories have mergeable histories (i.e. I could fetch and merge between the two), which is a slightly better-defined problem. I'm willing to ignore the possibility that a user has modified history, but would love to hear how to handle that case as well.

Answer Source

As you can see in the related question, there is NO unique identifier for a git repository. However, you could simply compare the SHA-1 of the first commit on the master branch; that should suffice in 99.999% of all cases (assuming the first commit is never changed).

And if you want to be even more sure, you could also compare the SHA-1 of the second commit, again assuming it will never change :). With the SHA-1 hashes of the first two commits (160 bits each), I guess you have about a 1 / 2^320 = 4.7*10^-97 chance of being wrong ...

If you are not sure there is even a master branch, you could assume there is only one parentless root commit and take its SHA-1. You can use this command to get the root commit (or commits):

git rev-list --parents HEAD | grep -E '^[a-f0-9]{40}$'

(copied from this answer)

or (easier to understand, thanks @TomHale):

git rev-list --parents HEAD | tail -1
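
The root-commit comparison can be wrapped into a small check. This is a minimal sketch, assuming both repositories are available as local paths (`git rev-list` walks history, so a remote would have to be fetched first). It uses `git rev-list --max-parents=0`, which selects parentless commits directly instead of filtering the `--parents` output. The function names `root_commits` and `share_root` are hypothetical:

```shell
# Hypothetical helper: print every root (parentless) commit reachable from HEAD.
root_commits() {
    git -C "$1" rev-list --max-parents=0 HEAD
}

# Hypothetical helper: succeed if the two local repositories
# have at least one root commit in common.
# Usage: share_root <repo-path-a> <repo-path-b>
share_root() {
    roots_a=$(root_commits "$1")
    # No roots (empty or unreadable repository): nothing to compare.
    [ -n "$roots_a" ] || return 1
    # -F fixed strings, -x whole-line match, -q quiet; each line of
    # $roots_a is treated as a separate pattern.
    root_commits "$2" | grep -Fxq -e "$roots_a"
}
```

Called as `share_root path/to/A path/to/B`, it returns success when the histories start from a common root. Like the commands above, it only looks at commits reachable from HEAD, so a root that exists only on an unrelated branch is not considered.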