Stefan Schouten - 1 month ago
Git Question

How do I know if a git commit has been changed?

Someone committed something a few months ago. Since then, multiple other commits have been made. Is it possible to see whether someone has changed the contents of that particular commit by amending or by rebasing? If so, how?


A commit, in Git, is never changed. Neither git rebase nor git commit --amend ever changes any commit, as this is not possible.[1]

The trick here lies in defining "a commit". How do you know which commit is which? If I say "a commit in the Git repository for Git", well, there are over 40,000 commits in there. Which one do I mean?

The unambiguous and definite way for me to tell you is for me to give you the hash ID, e.g., 9b7cbb315923e61bb0c4297c701089f30e116750. That is the true name for one specific commit:

$ git cat-file -p 9b7cbb315923e61bb0c4297c701089f30e116750 | sed 's/@/ /'
tree 4ba58c32960dcecc1fedede9c9362f5c10158f08
parent 77933f4449b8d6aa7529d627f3c7b55336f491db
author Junio C Hamano <gitster> 1418845774 -0800
committer Junio C Hamano <gitster> 1418845774 -0800

Git 2.2.1

Signed-off-by: Junio C Hamano <gitster>

This name is permanently attached to this particular commit. It sure is an unwieldy and ugly name, though. Wouldn't it be nice to have a shorter, prettier, wieldy name? And there is one: I can point you to v2.2.1:

$ git rev-parse v2.2.1^{commit}
9b7cbb315923e61bb0c4297c701089f30e116750

But in fact, v2.2.1 is not a commit at all, it's a tag. Specifically, it is a tag name (found in refs/tags/v2.2.1 or in the packed-refs file under the name v2.2.1) pointing to an annotated tag object,[2] rather than directly to a commit:

$ git rev-parse v2.2.1

The tag object has the commit ID inside it, plus a whole bunch of additional goop, including a "PGP signature":

$ git cat-file -p v2.2.1 | sed 's/@/ /'
object 9b7cbb315923e61bb0c4297c701089f30e116750
type commit
tag v2.2.1
tagger Junio C Hamano <gitster> 1418851265 -0800

Git 2.2.1
Version: GnuPG v1


The PGP signature is what lets us decide whether we believe Junio C Hamano really made and signed this tag. It is a true digital signature, which is stronger than the bare SHA-1 hash (good, since SHA-1 is, at least in theory, breakable), and it also supports both distributed verification and the ability to revoke signatures (which SHA-1 itself does not).

In the end, though, that only helps us if someone we trust and/or can verify has made such a PGP-signed tag, or has PGP-signed a commit. In theory, signing each commit might be a bit stronger, since then there is a digital signature directly on the commit. In practice, signing tags is much more convenient, and just as good, since we don't regularly go about breaking SHA-1. (At least with current brute-force methods, breaking it would leave obvious marks. That is way beyond the scope of this answer, though, and also somewhat beyond me to describe properly: cryptography is not my field.)
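As a rough sketch of what checking a signature looks like in practice: git verify-tag and git verify-commit are the real commands, but the check_tag helper, the throwaway repository, and the tag name unsigned-example are all made up for illustration. The script only exercises the failure case (a tag with no signature), since a successful check needs the signer's public key in your GPG keyring.

```shell
#!/bin/sh
# Sketch: report whether a tag carries a signature that Git can verify.
set -eu

# Throwaway identity so the script does not depend on global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# check_tag NAME: hypothetical helper, prints "verified" or "unverified".
check_tag() {
    if git verify-tag "$1" >/dev/null 2>&1; then
        echo "verified"
    else
        echo "unverified"
    fi
}

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "x" > f.txt
git add f.txt
git commit -q -m "release"

# An annotated tag made without -s has no signature to check
# (assuming tag signing is not forced on in your Git config).
git tag -a unsigned-example -m "Unsigned release"
unsigned_result=$(check_tag unsigned-example)
echo "unsigned-example: $unsigned_result"

# For a signed tag such as v2.2.1 in a clone of the Git repository,
# with the signer's public key in your keyring, the checks would be:
#   git verify-tag v2.2.1        # or: git tag -v v2.2.1
#   git verify-commit <hash>     # for a commit made with `git commit -S`
```

A tag that fails this check is not necessarily malicious; it may simply never have been signed, which is exactly the case where you have nothing but the hash to go on.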

[1] Well, it's theoretically possible if you can break the SHA-1 hash. The way Git behaves if you come up with a new, different object that nonetheless produces the same hash means you won't ever pick up this new object if you already have the old one, though. This rule applies to all Git objects (commits, trees, annotated tags, and blobs), all of which are named by their hashes.
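To make "named by their hashes" concrete, here is a small sketch that recomputes a commit's name from its raw content using git hash-object, and then does the same with one word altered. The throwaway repository, file name, and commit message are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch: a commit's name is a hash of its content, so different
# content always means a different name.
set -eu

# Throwaway identity so the script does not depend on global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "hello" > file.txt
git add file.txt
git commit -q -m "original message"

name=$(git rev-parse HEAD)

# Recompute the name from the raw commit content.
# Without -w, hash-object writes nothing to the repository.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

# Alter one word of the content and hash again.
tampered=$(git cat-file commit HEAD \
    | sed 's/original/tampered/' \
    | git hash-object -t commit --stdin)

echo "name:       $name"
echo "recomputed: $recomputed"
echo "tampered:   $tampered"
```

The first two values match and the third does not, which is the whole point: you cannot alter the content and keep the name.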

What git rebase and git commit --amend do, to make it seem like they changed commits, is to make new copies of existing commits, and then shuffle the names around. The new commits have new, different hashes, and since a later (descendant) commit literally contains the hash of its immediate ancestor (parent) commit, "changing" one commit's hash (i.e., copying the commit object to a new, different commit object) forces the change to bubble down through the rest of the commits. We then re-point the existing (short, branch or tag) name to the tip of the new chain.
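The copy-then-re-point behavior is easy to watch in a scratch repository. This is only a sketch: the repository, file, and messages are invented, and it shows git commit --amend alone (git rebase does the same thing for a whole chain of commits).

```shell
#!/bin/sh
# Sketch: `git commit --amend` writes a new commit object and moves the
# branch name to it; the old commit object is left untouched.
set -eu

# Throwaway identity so the script does not depend on global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

echo "first" > file.txt
git add file.txt
git commit -q -m "original message"
old=$(git rev-parse HEAD)

# "Change" the commit: in reality this creates a second commit object.
git commit -q --amend -m "amended message"
new=$(git rev-parse HEAD)

echo "before amend: $old"
echo "after amend:  $new"

# The old object is still in the repository under its old name.
old_type=$(git cat-file -t "$old")
old_subject=$(git log -1 --format=%s "$old")
echo "old object is still a $old_type with subject: $old_subject"
```

The old commit stays around (at least until it is garbage-collected); it is merely no longer reachable from the branch name.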

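This is why, given an end-point that we believe is trustworthy, we can extend that trust to each previous object in the chain or tree: each object's hash covers the hashes of the objects it points to, so none of them can differ without changing the end-point's hash as well. The technical term for this is a Merkle tree.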
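One practical consequence, sketched below: if you wrote down a commit's hash at some point, you can later ask whether that exact commit is still part of the branch's history. The is_reachable helper and the scratch repository are hypothetical; git merge-base --is-ancestor is the real command doing the work. The rewrite is simulated with an amend plus a cherry-pick, which is roughly what a rebase does.

```shell
#!/bin/sh
# Sketch: detect a history rewrite by checking whether a previously
# recorded hash is still an ancestor of the current branch tip.
set -eu

# Throwaway identity so the script does not depend on global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# is_reachable HASH: hypothetical helper; succeeds if HASH is HEAD
# itself or one of its ancestors.
is_reachable() {
    git merge-base --is-ancestor "$1" HEAD
}

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

echo "one" > a.txt
git add a.txt
git commit -q -m "one"
known=$(git rev-parse HEAD)     # the hash someone wrote down months ago

echo "two" > b.txt
git add b.txt
git commit -q -m "two"
later=$(git rev-parse HEAD)

if is_reachable "$known"; then before=yes; else before=no; fi

# Simulate a rewrite: re-create "one" with a new message, then copy
# "two" on top of the new commit.
git reset -q --hard "$known"
git commit -q --amend -m "one (reworded)"
git cherry-pick "$later" >/dev/null

if is_reachable "$known"; then after=yes; else after=no; fi

echo "reachable before rewrite: $before"
echo "reachable after rewrite:  $after"
```

This only works if you have a trustworthy record of the old hash (or a signed tag that leads to it); without one, a rewritten history looks just like any other history.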

[2] This makes it what Git calls an "annotated tag": a tag name (which by itself would be a "lightweight tag") pointing to an annotated-tag object, stored in the Git repository, with the tag object pointing to some other Git object. That object is usually a commit, but it may be another tag, or even a tree or a blob. However, even "another tag" is somewhat rare (there are just three of these in the Git repository for Git), and the other two are practically unheard-of.
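The difference between the two kinds of tag shows up if you ask Git what type of object each name resolves to. A minimal sketch, using a scratch repository and two made-up tag names (light-example and annotated-example):

```shell
#!/bin/sh
# Sketch: a lightweight tag names a commit directly; an annotated tag
# names a tag object, which in turn points at the commit.
set -eu

# Throwaway identity so the script does not depend on global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "x" > f.txt
git add f.txt
git commit -q -m "release"

git tag light-example                          # lightweight tag
git tag -a annotated-example -m "Release"      # annotated tag

light_type=$(git cat-file -t light-example)
annotated_type=$(git cat-file -t annotated-example)

# ^{commit} "peels" the tag object to get at the commit inside it.
peeled_type=$(git cat-file -t 'annotated-example^{commit}')
light_hash=$(git rev-parse light-example)
tag_hash=$(git rev-parse annotated-example)
peeled_hash=$(git rev-parse 'annotated-example^{commit}')

echo "light-example            -> $light_type  $light_hash"
echo "annotated-example        -> $annotated_type  $tag_hash"
echo "annotated-example (peel) -> $peeled_type  $peeled_hash"
```

Both names lead to the same commit in the end; the annotated one just goes through an extra object, and that extra object is where a PGP signature can live.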