RufM - 5 months ago
Bash Question

Batch script to remove parts of a filename within variable characters

Help please!

I have a set of 60 files named in the following format:

  • XXXXX_L2_R1_001_XneCgnfdkjTTTnm.fastq.gz

  • XXXXX_L2_R2_001_GmnbkjZZnvhkfPn.fastq.gz

and I would like to remove the "_L2" part and everything else after the third underscore, in order to have something like:

  • XXXXX_R1.fastq.gz

  • XXXXX_R2.fastq.gz

The number "XXXXX" varies between the files, and for each number there is always an R1 file and an R2 file.

Maybe a rename or a sed command can help.



Using the Perl-based rename utility (the one that accepts an s/// expression), you can do a dry run like this:

rename -n 's/^([^_]+)_L2(_[^_]+)[^.]+(\..+)$/$1$2$3/' *.gz

Once you are satisfied with the dry-run output, remove the -n option and rerun the command to actually rename the files.
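
If the Perl-based rename is not available, the same result can be sketched in plain bash with parameter expansion. This is only a minimal sketch, not part of the answer above: the helper name new_name is hypothetical, and it assumes the sample number contains no underscore, that "_L2_" appears exactly as in the examples, and that the first dot in the name starts the ".fastq.gz" extension. As written, it only prints the mv commands.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the new name for one file.
# Assumes names of the form <sample>_L2_<R1|R2>_001_<random>.fastq.gz
new_name() {
    local f=$1
    local sample=${f%%_*}    # text before the first underscore
    local rest=${f#*_L2_}    # text after "_L2_", e.g. R1_001_Xne....fastq.gz
    local pair=${rest%%_*}   # R1 or R2
    local ext=${f#*.}        # everything after the first dot, e.g. fastq.gz
    printf '%s_%s.%s\n' "$sample" "$pair" "$ext"
}

for f in *_L2_R[12]_*.fastq.gz; do
    [ -e "$f" ] || continue                    # skip if the glob matched nothing
    echo mv -n -- "$f" "$(new_name "$f")"      # dry run: prints the command only
done
```

Once the printed commands look right, remove the echo to perform the renames; mv -n refuses to overwrite an existing file.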