domi771 - 2 months ago
Linux Question

Linux command line: recursively remove files that don't exist in another folder

I have the following folder structure in folder /1:

/1/1/
1.png
2.png
5.png
6.png

/1/2/
3.png
4.png

/1/3/
10.png
11.png
14.png


There are subfolders 1-3 in this example; in real life there are hundreds of folders. Each subfolder contains an unknown number of PNG files.

Then I have a folder /2 which has the exact same subfolder structure but more images in it than folder /1:

/2/1/
1.jpg
2.jpg
3.jpg
4.jpg
5.jpg
6.jpg

/2/2/
1.jpg
2.jpg
3.jpg
4.jpg

/2/3/
10.jpg
11.jpg
12.jpg
13.jpg
14.jpg


Please note the different file extension in folder /2 (.jpg). The only thing the files have in common is the base file name; the extension differs between folders /1 and /2.

What I am trying to achieve in Linux is to clean up folder /2 so that it keeps only the images for which a file with the same name exists in folder /1.

Can anybody provide a command I can use from the command line, or a bash script?

The final result in folder /2 should be:

/2/1/
1.jpg
2.jpg
5.jpg
6.jpg

/2/2/
3.jpg
4.jpg

/2/3/
10.jpg
11.jpg
14.jpg


Thank you!

Answer

Here's a way to do this with find and a simple while loop in bash:

cd /path/to/2 || exit 1
find . -type f -name '*.jpg' -print0 |
    while IFS= read -r -d '' path; do
        if [[ ! -e "/path/to/1/${path%.jpg}.png" ]]; then
                  # ^^^^^^^^^^ adjust this path
            echo rm -- "$path"
           #^^^^ remove this after the first dry-run
        fi
    done

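The ${path%.jpg} expansion strips the .jpg suffix, so each file in /2 is checked against the .png of the same name under /path/to/1. Run it once as is: it only echoes the rm commands it would run. Check that the listed files are the right ones, and if everything looks OK, remove the echo in front of rm and run it again.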
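
If you need this more than once, the same idea can be wrapped in a small function. This is only a sketch, not the exact method from the answer above: it swaps the while read loop for find -exec sh -c so that it also runs in plain POSIX sh, and the name prune_unmatched, its argument order, and the --delete switch are all made up for illustration. Without --delete it only prints what it would remove.

```shell
# Hypothetical helper: prune_unmatched REF_DIR TARGET_DIR [--delete]
# Looks at every *.jpg under TARGET_DIR and flags it when REF_DIR has no
# .png with the same relative path and base name.
# The body runs in a subshell, so the cd does not affect the caller.
prune_unmatched() (
    ref=$(cd "$1" && pwd) || exit 1    # absolute path of the reference tree
    mode=${3:-}                        # empty means dry run
    cd "$2" || exit 1
    find . -type f -name '*.jpg' -exec sh -c '
        ref=$1 mode=$2; shift 2
        for path do
            rel=${path#./}             # ./1/3.jpg becomes 1/3.jpg
            if [ ! -e "$ref/${rel%.jpg}.png" ]; then
                if [ "$mode" = --delete ]; then
                    rm -- "$path"
                else
                    printf "would remove: %s\n" "$rel"
                fi
            fi
        done
    ' sh "$ref" "$mode" {} +
)
```

Called as prune_unmatched /path/to/1 /path/to/2 it lists the candidates relative to the second folder; adding --delete as a third argument performs the removal.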