David James - 3 months ago
Git Question

Git Commit Messages: 50/72 Formatting

Tim Pope argues for a particular git commit message style in his blog post:
http://www.tpope.net/node/106

Here is a quick summary of what he recommends:


  • First line is 50 characters or less

  • Then a blank line

  • Remaining text should be wrapped at 72 characters
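
To make the three rules above concrete, here is a minimal sketch of a shell function that checks a commit message against them. The name check_5072 is made up for this example, and it assumes an awk whose length() counts what you think of as characters (some awks count bytes, which matters for non-ASCII text).

```shell
# Hypothetical helper (not part of git): reads a commit message on stdin
# and reports every line that breaks the 50/72 convention.
# Exit status is 0 when the message conforms, 1 otherwise.
check_5072() {
  awk '
    # Rule 1: the first line is 50 characters or less.
    NR == 1 && length($0) > 50 {
      printf "line 1: subject is %d chars (max 50)\n", length($0); bad = 1
    }
    # Rule 2: the second line is blank.
    NR == 2 && length($0) > 0 {
      print "line 2: expected a blank line"; bad = 1
    }
    # Rule 3: remaining lines are at most 72 characters.
    NR > 2 && length($0) > 72 {
      printf "line %d: body line is %d chars (max 72)\n", NR, length($0); bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

Inside a repository, something like "git log -1 --format=%B | check_5072" should check the most recent commit; it prints nothing when the message conforms.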



His blog post gives the rationale for these recommendations (which I will call "50/72 formatting" for brevity):


  • In practice, some tools treat the first line as a subject line and the second paragraph as a body (similar to email)

  • git log does not handle wrapping, so it is hard to read if lines are too long.

  • git format-patch --stdout converts commits to email, so to play nice it helps if your commits are already wrapped nicely.

  • a point I would like to add that I think Tim would agree with: the act of summarizing your commit is a good practice inherently in any version control system. It helps others (or a later you) find relevant commits more quickly.



So, I have a couple of parts to my question:


  • What chunk (roughly) of the 'thought leaders' or 'experienced users' of git embrace the 50/72 formatting style? I ask this because sometimes newer users don't know or don't care about community practices.

  • For those that don't use this formatting, is there a principled reason for using a different formatting style? (Please note that I'm looking for an argument on the merits, not "I've never heard of it" or "I don't care.")

  • Empirically speaking, what percentage of git repositories embrace this style? (In case someone wants to do an analysis on GitHub repositories... hint, hint.)



My point here is not to recommend the 50/72 style or shoot down other styles. (To be open about it, I do prefer it, but I am open to other ideas.) I just want to get the rationale for why people like or oppose various git commit message styles. (Feel free to bring up points that haven't been mentioned, too.)

Answer

Regarding the "summary" line (the 50 in your formula), the Linux kernel documentation has this to say:

For these reasons, the "summary" must be no more than 70-75
characters, and it must describe both what the patch changes, as well
as why the patch might be necessary.  It is challenging to be both
succinct and descriptive, but that is what a well-written summary
should do.

That said, it seems like kernel maintainers do indeed try to keep things around 50. Here's a histogram of the lengths of the summary lines in the git log for the kernel:

[Figure: histogram of the lengths of git summary lines]

There is a smattering of commits that have summary lines that are longer (some much longer) than this plot can hold without making the interesting part look like one single line. (There's probably some fancy statistical technique for incorporating that data here but oh well... :) ).

If you want to see the raw lengths:

cd /path/to/repo
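# HEAD is given explicitly so shortlog does not try to read from stdin in scripts;
# the sed expression avoids the GNU-only \+ so it also works with BSD sed.
git shortlog HEAD | grep -e '^      ' | sed 's/^[[:space:]]*//' | awk '{print length($0)}'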

or a text-based histogram:

cd /path/to/repo
git shortlog HEAD | grep -e '^      ' | sed 's/^[[:space:]]*//' | awk '{lens[length($0)]++;} END {for (len in lens) print len, lens[len] }' | sort -n
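
If you would rather have a single number than a histogram, the following is a rough sketch along the same lines. It reads one summary line per input line and reports how many stay within 50 characters. The function name subject_stats is invented for this example, and it leans on git log --format=%s to print just the subject of each commit, which avoids stripping the indentation from shortlog output.

```shell
# Hypothetical helper (not part of git): reads one commit subject per line
# on stdin and prints how many are at most 50 characters long.
subject_stats() {
  awk '
    { total++; if (length($0) <= 50) ok++ }
    END {
      if (total == 0) { print "no commits"; exit }
      printf "%d of %d subjects (%.1f%%) are 50 chars or fewer\n", ok, total, 100 * ok / total
    }
  '
}

# Usage, from inside a repository (add --no-merges to skip merge commits):
#   git log --format=%s | subject_stats
```

This only looks at the summary line; it says nothing about whether the body is wrapped at 72.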