Bash Question

Removing duplicate files with same size in shell script

I have a directory containing multiple files with the same content but different names. The only criterion I could think of for removing the duplicates was to sort the files by size and then remove those with the same size. For instance, when I type

find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3

I get

./abc.txt - 595
./acd.txt - 595
./dbc.txt - 595
./jed.txt - 595
./end.txt - 595
./wtw.txt - 595
./hds.txt - 595
./dkd.txt - 523
./kjk.txt - 523

I would like to keep only


Answer Source
find . -type f -printf "%p - %s\n" | uniq -D -f1 | sort -nr -k3
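  • uniq only detects duplicates on adjacent lines, so the input has to be sorted first: sort must come before uniq in the pipeline.

  • The uniq option -D is out of place here: it prints every line of each duplicate group, whereas you want a single line per group.

  • The sort option -u can do the job of uniq: combined with -k3 it keeps one line per distinct size.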

find . -type f -printf "%p - %s\n" | sort -nru -k3
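
The pipeline above only prints one file per size; it does not delete anything. A minimal sketch of the deletion step, assuming bash with GNU find and sort, might look like the following. The function name dedupe_by_size is made up for illustration. The cmp check is an added precaution that is not part of the original answer: two files of equal size need not have equal content, so a file is removed only when it is byte-identical to the file kept just before it.

```bash
#!/usr/bin/env bash
# Sketch: delete files that duplicate an earlier file of the same size.
# dedupe_by_size is a hypothetical helper name; requires GNU find and sort.
dedupe_by_size() {
    local dir=${1:-.} prev_size='' keep='' size path
    # find emits "size<TAB>path" records, NUL-terminated so unusual
    # filenames survive; sort -z -n puts equal sizes next to each other.
    while IFS=$'\t' read -r -d '' size path; do
        if [[ $size == "$prev_size" ]] && cmp -s "$keep" "$path"; then
            rm -- "$path"        # same size and same bytes as the kept file
        else
            prev_size=$size      # new size or different content: keep it
            keep=$path
        fi
    done < <(find "$dir" -type f -printf '%s\t%p\0' | sort -z -n)
}
```

Since this deletes files, it is worth trying on a copy of the directory first, for example with dedupe_by_size . from inside that copy.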
