DanielG - 4 months ago
Bash Question

grep for two patterns independently (in different lines)

I have some directories with the following structure:

DAY1/                # Files under this directory should have DAY1 in the name.
|-- Date
|   |-- dir1         # Something wrong here, there are files with DAY2 and files with DAY1.
|   |-- dir2
|   |-- dir3
|   |-- dir4
DAY2/                # Files under this directory should all have DAY2 in the name.
|-- Date
|   |-- dir1
|   |-- dir2         # Something wrong here, there are files with DAY2, and files with DAY1.
|   |-- dir3
|   |-- dir4


In each dir there are hundreds of thousands of files with names containing DAY, for example 0.0000.DAY1.01927492. Files with DAY1 in the name should only appear under the parent directory DAY1.

Something went wrong when copying files around, so I now have a mix of DAY1 and DAY2 files in some of the dir directories.

I wrote a script to find folders that contain mixed files, so I can then look at them more closely. My script is the following:

for directory in */; do
    # Quote "$directory" so names with spaces or glob characters survive.
    if ls "$directory" | grep -q DAY2; then
        if ls "$directory" | grep -q DAY1; then
            echo "mixed files in $directory"
        fi
    fi
done


The problem here is that I'm going through all files twice, which doesn't make sense, since I should only have to look through the files once.

What would be a more efficient way to achieve what I want?

Answer

If I understand you correctly, you need to recursively find the files under the DAY1 directory that have DAY2 in their names and, similarly, the files under the DAY2 directory that have DAY1 in their names.

If so, for DAY1 directory:

find DAY1/ -type f -name '*DAY2*'

This will get you the files under the DAY1 directory that have DAY2 in their names. Similarly, for the DAY2 directory:

find DAY2/ -type f -name '*DAY1*'

Both are recursive operations.
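If there are more than two DAY directories, the two commands above could be folded into one loop. The following is only a sketch: the function name find_misplaced is hypothetical, and it assumes the top-level directories are named exactly like the tag their files should carry (DAY1, DAY2, ...) and that no tag is a prefix of another (DAY1 would also match DAY10).

```shell
# Hypothetical helper: for every top-level DAYn directory, list files whose
# name contains "DAY" but not the tag of the tree they live in.
find_misplaced() {
    local top tag
    for top in DAY*/; do
        [ -d "$top" ] || continue       # skip the literal pattern if nothing matched
        tag=${top%/}                    # "DAY1/" -> "DAY1"
        # -name tests only the file name, not the directories in the path.
        find "$tag" -type f -name '*DAY*' ! -name "*${tag}*"
    done
}
```

Run find_misplaced from the directory that contains DAY1 and DAY2; each tree is walked once.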


To get the directory names only:

find DAY1/ -type f -name '*DAY2*' -exec dirname {} +

Note that $PWD will be shown as . (a single dot).

To remove duplicates, pipe the output to sort -u:

find DAY1/ -type f -name '*DAY2*' -exec dirname {} + | sort -u
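To answer the original question more literally (report every directory that holds both DAY1 and DAY2 files while reading each file name only once), one option is to let find walk the tree a single time and have awk keep track of which tags it has seen per directory. This is a sketch rather than a tested solution for the asker's data: the function name find_mixed_dirs is hypothetical, and it assumes file names contain no newline characters.

```shell
# Hypothetical helper: print directories that contain both a *DAY1* file
# and a *DAY2* file. The tree under $1 (default: .) is traversed once.
find_mixed_dirs() {
    find "${1:-.}" -type f -name '*DAY[12]*' |
    awk '{
        dir = $0
        sub("/[^/]*$", "", dir)               # drop the file name, keep the directory
        name = substr($0, length(dir) + 2)    # file name only; the path itself contains DAY1/DAY2
        if (name ~ /DAY1/) seen1[dir] = 1
        if (name ~ /DAY2/) seen2[dir] = 1
    }
    END {
        for (d in seen1)
            if (d in seen2) print d
    }' | sort
}
```

Calling find_mixed_dirs with no argument searches from the current directory. The tags are matched against the file name alone because the parent directories are themselves called DAY1 and DAY2.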