ddn ddn - 6 months ago 15
Bash Question

How to delete one set of files in a directory containing similarly named files?

A series of several hundred directories contains files in the following pattern:

Dir1:
-text_76.txt
-text_81.txt
-sim_76.py
-sim_81.py

Dir2:
-text_90.txt
-text_01.txt
-sim_90.py
-sim_01.py


Within each directory, the files beginning with text or sim are essentially duplicates of the other text or sim file, respectively. Each set of duplicate files has a unique numerical identifier. I only want one set per directory. So, in Dir1, I would like to delete everything in the set labeled either 81 OR 76, with no preference. Likewise, in Dir2, I would like to delete either the set labeled 90 OR 01. Each directory contains exactly two sets, and there is no way to predict the random numerical IDs used in each directory. How can I do this?

Answer

Assuming every directory contains at least one file named text_xx.txt, where xx is the two-digit ID, you could run this command in each sub-directory:

ls text_*.txt | { IFS= read -r first; rm -- *"${first:4:4}"*; }

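This lists every file matching the pattern text_*.txt, and read keeps only the first line of that output in the variable first. The expansion ${first:4:4} then takes four characters starting at offset 4, which for text_xx.txt is _xx. (underscore, two digits, dot). Those characters go into the glob passed to rm, so the command effectively runs rm *_xx.* and deletes both text_xx.txt and sim_xx.py. Note that the fixed length of 4 assumes the ID is always two digits.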
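To make the substring step concrete, here is a minimal sketch using a file name from the question's Dir1 listing. The variable names id, tmp and alt are only illustrative. The second form, which strips the known prefix and suffix instead of counting characters, is an alternative I am suggesting in case the IDs are not always two digits; it is not part of the command above.

```shell
#!/usr/bin/env bash
# Example file name taken from the question's Dir1 listing.
first="text_76.txt"

# ${var:offset:length}: skip the 4 characters of "text", keep the next 4.
id="${first:4:4}"
printf '%s\n' "$id"    # prints: _76.

# Alternative (an assumption, not from the original answer):
# strip the "text" prefix and the "txt" suffix, so the ID may be any length.
tmp="${first#text}"    # _76.txt
alt="${tmp%txt}"       # _76.
```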

This should remove one "fileset" every time it is run in a given sub-directory.

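Note: The command does not check how many filesets are present. If a directory contains only one fileset, that fileset is still removed, so running it twice in the same directory deletes everything.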
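If the single-fileset case is a concern, one possible way to guard against it is sketched below. This is a hedged variation on the command above, not the original answer's method: prune_one_set is a hypothetical helper name, it uses a glob instead of ls output, and it assumes each fileset has exactly one text_*.txt file, so a directory with two such files holds two sets.

```shell
#!/usr/bin/env bash
# Sketch: remove one fileset from a directory only if it holds exactly two.
# prune_one_set is a hypothetical helper, not part of the original answer.
prune_one_set() {
    local dir=$1
    local texts=()
    local f

    # Collect the text_*.txt files with a glob rather than parsing ls.
    for f in "$dir"/text_*.txt; do
        if [ -e "$f" ]; then
            texts+=("$f")
        fi
    done

    # Guard: leave the directory alone unless it has exactly two sets.
    if [ "${#texts[@]}" -ne 2 ]; then
        return 0
    fi

    local name=${texts[0]##*/}   # e.g. text_76.txt
    local id=${name#text}        # e.g. _76.txt
    id=${id%txt}                 # e.g. _76.

    # Delete every file in the directory whose name contains that ID.
    rm -- "$dir"/*"$id"*
}

# Usage, assuming the directories are children of the current directory:
# for d in */; do prune_one_set "${d%/}"; done
```

Because the function returns early unless it finds exactly two text_*.txt files, running the loop a second time should leave the remaining fileset in place. It may be worth replacing rm with echo rm for a first dry run.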