thanksd thanksd - 3 months ago 8
Git Question

Git find stale files

I've started maintaining a large, unwieldy repo with a lot of outdated code that could be deprecated or removed. I want to find files that have not been changed since a specific commit so that I can go through them and check whether they're still necessary.

I can find files in my repo that have changed since a specific commit via:

git diff --name-only SHA


But how do I find files that have not been changed?

Answer

The shell utility comm is seriously under-appreciated.

$ git diff --name-only $SHA | LC_ALL=C.UTF-8 sort > /tmp/A
$ git ls-tree -r --full-tree --name-only HEAD | LC_ALL=C.UTF-8 sort > /tmp/B
$ LC_ALL=C.UTF-8 comm -13 /tmp/A /tmp/B

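will produce the list you want by subtracting the set of changed files from the set of all files: comm -13 prints only the lines that appear in the second file and not in the first. (It's a little finicky, though: comm requires both inputs to be sorted in the same collation order, hence all the LC_ALL overrides. If you get error messages about C.UTF-8, try just C instead.)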
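If you expect to run this more than once, the three commands can be wrapped in a small function. The following is only a sketch: the name unchanged_since is made up for illustration, it uses mktemp instead of the fixed /tmp/A and /tmp/B paths, and it sticks to plain C for the locale. It also adds a git rev-parse check that is not part of the commands above, so that a mistyped commit does not silently list every file.

```shell
# Hypothetical helper (name chosen for illustration): print the files tracked
# in HEAD that do not show up in `git diff --name-only <commit>`.
# The body runs in a subshell so its variables do not leak into the caller.
unchanged_since() (
    # Refuse to continue if the argument does not name a commit.
    git rev-parse --verify --quiet "$1^{commit}" > /dev/null || exit 1

    changed=$(mktemp) || exit 1
    all=$(mktemp) || { rm -f "$changed"; exit 1; }

    # Set of changed files and set of all files, sorted in the same locale.
    git diff --name-only "$1" | LC_ALL=C sort > "$changed"
    git ls-tree -r --full-tree --name-only HEAD | LC_ALL=C sort > "$all"

    # -13 suppresses columns 1 and 3, leaving lines found only in "$all".
    LC_ALL=C comm -13 "$changed" "$all"

    rm -f "$changed" "$all"
)
```

Run it from anywhere inside the repo as unchanged_since SHA, with SHA being the commit you want to compare against. Like the original commands, it compares against the working tree, so uncommitted edits count as changes.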