Aᴍɪʀ Aᴍɪʀ - 5 months ago 52
Git Question

Running git diff-tree with --numstat and --name-status

I'm writing a script to analyze the changes that have been made to a Git repo.
At some point I need to iterate over all the commits and obtain the following information about each of them:

  • Commit ID

  • Date

  • Commit Message

  • ...

  • Files changed

    • File Name

    • Type of change (Added/Modified/Removed/Renamed)

    • New File Name (in case the change type is "Renamed")

    • Number of lines added

    • Number of lines removed

I get the commit messages and dates with git log. The issue I have is with the files.

If I didn't need the number of lines added/removed, I'd simply use:

git diff-tree --no-commit-id --name-status -M -r abcd12345

The output would be something like

A Readme.md
M src/something.js
D src/somethingelse.js
R100 tests/a/file.js tests/b/file.js

Which I can parse and read programmatically.
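As a rough illustration of that parsing step, here is a minimal Python sketch. It assumes the real output separates fields with tabs (the sample above shows them as spaces) and that no path needs quoting; the function name parse_name_status and the dictionary keys are made up for this example.

```python
def parse_name_status(text):
    """Parse the text printed by `git diff-tree --no-commit-id --name-status -M -r`.

    Assumes tab-separated fields and unquoted paths (hypothetical helper).
    """
    changes = []
    for line in text.split("\n"):
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank line or anything that is not a status entry
        status = fields[0]
        kind = status[0]  # first letter: A, M, D, R, C, ...
        entry = {"type": kind, "path": fields[1], "new_path": None, "score": None}
        if kind in ("R", "C") and len(fields) >= 3:
            # Renames and copies carry a similarity score and a second path.
            entry["score"] = int(status[1:]) if status[1:] else None
            entry["new_path"] = fields[2]
        changes.append(entry)
    return changes


sample = (
    "A\tReadme.md\n"
    "M\tsrc/something.js\n"
    "D\tsrc/somethingelse.js\n"
    "R100\ttests/a/file.js\ttests/b/file.js\n"
)
for change in parse_name_status(sample):
    print(change)
```

For paths containing tabs, newlines, or other special characters, splitting on tabs is not reliable, since git quotes such paths in this output mode.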

To get information about lines added/removed, I could use this:

git diff-tree --no-commit-id -M -r --numstat abcd12345

The output would be like:

82 0 Readme.md
41 98 src/something.js
0 64 src/somethingelse.js
0 0 tests/{a => b}/file.js

This is not very machine-readable for renamed files.

My question is: Is there any way to combine these two commands? It seems I can't use --numstat and --name-status together.

I could also run two separate commands and merge the results in my script. In that case, are there any other switches I can use to make the output of the second command more machine-readable?



I think your analysis (that you need two separate commands) is correct. Use -z to obtain machine-readable output with --numstat (this disables both the fancy rename encoding and all special-character quoting), but note that you will then have to break the output apart at ASCII NULs instead of newlines.
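To make the NUL-splitting concrete, here is a minimal Python sketch of a parser for the -z --numstat output. It relies on the documented -z layout: an ordinary entry is "added TAB removed TAB path NUL", while a renamed or copied entry is "added TAB removed TAB NUL old-path NUL new-path NUL". The names parse_numstat_z and numstat_for_commit and the dictionary keys are hypothetical, and the sketch assumes the command is run with --no-commit-id so that no commit ID precedes the entries.

```python
import subprocess


def parse_numstat_z(data):
    """Parse the bytes printed by `git diff-tree --no-commit-id -z --numstat -M -r`."""
    tokens = data.split(b"\0")
    records = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        i += 1
        if token.count(b"\t") < 2:
            continue  # trailing empty token or anything that is not an entry
        added, removed, path = token.split(b"\t", 2)
        new_path = None
        if path == b"" and i + 1 < len(tokens):
            # Rename/copy: the old and new paths follow as separate tokens.
            path, new_path = tokens[i], tokens[i + 1]
            i += 2
        records.append({
            # Binary files report "-" instead of line counts.
            "added": None if added == b"-" else int(added),
            "removed": None if removed == b"-" else int(removed),
            "path": path.decode("utf-8", errors="surrogateescape"),
            "new_path": (new_path.decode("utf-8", errors="surrogateescape")
                         if new_path is not None else None),
        })
    return records


def numstat_for_commit(commit, repo="."):
    """Run git for one commit and parse the result (assumes git is on PATH)."""
    out = subprocess.run(
        ["git", "-C", repo, "diff-tree", "--no-commit-id", "-z",
         "--numstat", "-M", "-r", commit],
        check=True, capture_output=True,
    ).stdout
    return parse_numstat_z(out)


sample = (
    b"82\t0\tReadme.md\0"
    b"41\t98\tsrc/something.js\0"
    b"0\t64\tsrc/somethingelse.js\0"
    b"0\t0\t\0tests/a/file.js\0tests/b/file.js\0"
)
for record in parse_numstat_z(sample):
    print(record)
```

Because each record carries the old path (and the new path for renames), it can be matched against the --name-status entries for the same commit by path when merging the two results.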