dimitris93 - 4 months ago
Git Question

GitPython "blame" does not give me all changed lines

I am using GitPython. Below I print the total number of lines changed in a specific commit:


from git import Repo

repo = Repo("C:/Users/shiro/Desktop/lucene-solr/")

# Sum the lines that blame attributes to one specific commit
sum_lines = 0
for blame_commit, lines_list in repo.blame('HEAD', 'lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java'):
    if blame_commit.hexsha == 'f092795fe94ba727f7368b63d8eb1ecd39749fc4':
        sum_lines += len(lines_list)
print(sum_lines)

The output is 38. However, if you go to https://github.com/apache/lucene-solr/commit/f092795fe94ba727f7368b63d8eb1ecd39749fc4 and look at the changes to this file in that commit, the actual number of lines changed is not 38 but 47. Some lines are missing entirely.

Why am I getting the wrong value?


git blame tells you which commit last changed each line in a given file.

You're not counting the number of lines changed in that commit, but rather the number of lines in the file at your current HEAD that were last modified by that specific commit.

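Passing f092795fe94ba727f7368b63d8eb1ecd39749fc4 instead of HEAD as the revision to blame should give you the result you expect, because later commits have not yet overwritten any of its lines at that point: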

$ git blame f092795fe94ba727f7368b63d8eb1ecd39749fc4 ./lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java | grep f092795 | wc -l
$ git blame master ./lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java | grep f092795 | wc -l
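
To illustrate why the two counts differ, here is a minimal sketch in Python. It does not touch a real repository: the FakeCommit type, the helper count_lines_blamed_on, and the short hashes aaa111 and bbb222 are made up for the example, and the hand-written lists only stand in for what repo.blame(rev, path) returns (pairs of a commit and the lines attributed to it).

```python
from collections import namedtuple

# Hypothetical stand-in for git.Commit; only the hexsha attribute is used.
FakeCommit = namedtuple("FakeCommit", "hexsha")


def count_lines_blamed_on(blame_entries, sha):
    """Count the lines that blame attributes to the commit starting with `sha`.

    `blame_entries` is an iterable of (commit, lines) pairs, the same shape
    the question's loop unpacks from repo.blame(rev, path).
    """
    return sum(
        len(lines)
        for commit, lines in blame_entries
        if commit.hexsha.startswith(sha)
    )


# Blame taken at commit aaa111 itself: it wrote all three lines of the file.
blame_at_commit = [
    (FakeCommit("aaa111"), ["line 1", "line 2", "line 3"]),
]

# Blame taken at HEAD: a later commit bbb222 has since rewritten line 2,
# so that line is no longer attributed to aaa111.
blame_at_head = [
    (FakeCommit("aaa111"), ["line 1"]),
    (FakeCommit("bbb222"), ["line 2 (edited)"]),
    (FakeCommit("aaa111"), ["line 3"]),
]

count_at_commit = count_lines_blamed_on(blame_at_commit, "aaa111")
count_at_head = count_lines_blamed_on(blame_at_head, "aaa111")
print(count_at_commit, count_at_head)
```

With a real repository, the same helper could presumably be fed repo.blame('f092795fe94ba727f7368b63d8eb1ecd39749fc4', path) and repo.blame('HEAD', path) to reproduce the two shell commands above. Note that blame only reports lines that still exist in the blamed revision, so lines a commit deleted never show up in either count.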