user2765315 - 2 months ago
Bash Question

Unix shell: create separate tar files using strings in an array

We have services that generate files named after their modules. For example, the acqDou module generates:

acqDou_0001.out
acqDou_08981.out
acqDou_23423.out


The acq_cheat module generates files with almost the same naming scheme, but the trailing number is a bit different.

My requirement is to tar these files by module name, one tar file per module, such as acqDou.tar and acq_cheat.tar.


Below is the program I attempted:

cd /home/swap/output/outfiles
for i in *;
do
    j=`echo $i | grep -v 'out$'`
    if [ -z $j ];
        then continue;
    fi
    count1=$(echo $i | tr -d -c '_' | wc -m)
    if [ $count1 -eq 2 ]
    then
        two=`ls -1 $i | cut -d'_' -f2 | cut -d'.' -f1 | uniq`
    else
        two=`ls -1 $i | cut -d'_' -f1 | cut -d'.' -f1 | uniq`
    fi
    _FILE="${_FILE} $two"
done
_FILE2=`echo "${_FILE[@]}" | tr ' ' '\n' | sort -u | tr '\n' ' '`
echo "${_FILE2}"
for m in "${_FILE2}"
do
    ls -lrt *${m}*.x
    tar -cpf $m.tar /home/swap/output/outfiles/*${m}*
done
}

Answer

Create a tar file for every module type present in the directory, in one shot.

Let's say you have modules named acqDou, acqDumb, acqcheat and acqfunny, producing files in module_XXXX.out format.

The directory might look like the listing below. I have 4 modules and 500 files for each of them in the directory. The files are all empty, but that doesn't matter.

 >Wed Oct 05|01:15:13|gaurav@[STATION]:/root/ga/scripts/temp/tmp % ls -lrtha *.out|head -7 ; echo ; ls -lrtha |tail -8
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqDou_0.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqDumb_0.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqcheat_0.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqfunny_0.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqDou_1.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqDumb_1.out
-rw-rw-r--. 1 gaurav gaurav 0 Oct  5 00:58 acqcheat_1.out

-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqDumb_498.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqcheat_498.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqfunny_498.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqDou_499.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqDumb_499.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqcheat_499.out
-rw-rw-r--. 1 gaurav gaurav    0 Oct  5 00:58 acqfunny_499.out
drwxrwxr-x. 2 gaurav gaurav  64K Oct  5 01:14 .
 >Wed Oct 05|01:15:30|gaurav@[STATION]:/root/ga/scripts/temp/tmp %
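
To reproduce a directory like this for experimenting, something along the following lines could be used. It is only a sketch: the workdir variable and the mktemp scratch directory are assumptions for illustration, not part of the setup shown above.

```shell
# Sketch: build a scratch directory with 500 empty .out files per module.
# workdir is a hypothetical name; mktemp -d keeps the experiment away from real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

for module in acqDou acqDumb acqcheat acqfunny; do
    n=0
    while [ "$n" -lt 500 ]; do
        : > "${module}_${n}.out"    # create an empty file
        n=$((n + 1))
    done
done

# Show how many files were created for one module
ls acqDou_*.out | wc -l
```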

To achieve the objective, we can list the files, strip off the numbers (using sed), sort and de-duplicate the list (which gives the module names present in the directory), and then feed it to a while loop that reads the module names one by one. For each module, the loop generates a tar file named after the module and containing only that module's files.

Here is the command:

ls *.out | sed 's/_[0-9]*\.out$//' | sort -u | while read -r module
do
    # Match only this module's numbered .out files, so an existing .tar is never swept in
    tar -cvf "${module}.tar" "${module}"_[0-9]*.out
done
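
A variant that avoids reading ls output, and that also copes with module names containing an underscore (such as acq_cheat from the question), might look like the sketch below. It assumes every file is named module_number.out. The scratch directory, the workdir variable and the sample file names for acq_cheat are assumptions made only so the example runs on its own.

```shell
# Sketch only: sample files in a scratch directory stand in for the real output directory.
workdir=$(mktemp -d)            # hypothetical scratch location
cd "$workdir" || exit 1
touch acqDou_0001.out acqDou_08981.out acqDou_23423.out
touch acq_cheat_0001.out acq_cheat_0002.out   # assumed names for the acq_cheat module

# Derive each module name by cutting everything from the last underscore onwards,
# de-duplicate the names, then build one archive per module.
for f in *_*.out; do
    [ -e "$f" ] || continue     # the glob matched nothing
    printf '%s\n' "${f%_*}"     # acq_cheat_0001.out -> acq_cheat
done | sort -u |
while IFS= read -r module; do
    tar -cf "${module}.tar" "${module}"_[0-9]*.out
done

ls *.tar
```

Run in the directory shown earlier instead of the scratch directory, this loop should produce the same four archives as the command above.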

Once the command finishes, we get something like this:

 >Wed Oct 05|01:16:43|gaurav@[STATION]:/root/ga/scripts/temp/tmp % ls -lrth *.tar
-rw-rw-r--. 1 gaurav gaurav 260K Oct  5 01:16 acqcheat.tar
-rw-rw-r--. 1 gaurav gaurav 260K Oct  5 01:16 acqDou.tar
-rw-rw-r--. 1 gaurav gaurav 260K Oct  5 01:16 acqDumb.tar
-rw-rw-r--. 1 gaurav gaurav 260K Oct  5 01:16 acqfunny.tar
 >Wed Oct 05|01:16:48|gaurav@[STATION]:/root/ga/scripts/temp/tmp %

Four tar files have been created, one per module name, and each contains only the files of its own module. Let us check with the tar -tvf command. I will run it on every tar file and show just the first 2 lines of output per file.

 >Wed Oct 05|01:18:29|gaurav@[STATION]:/root/ga/scripts/temp/tmp % ls *.tar|while read file
 do 
     echo ;echo "Looking inside file: $file"
     tar -tvf $file|head -2
 done

Looking inside file: acqcheat.tar
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqcheat_0.out
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqcheat_100.out

Looking inside file: acqDou.tar
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqDou_0.out
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqDou_100.out

Looking inside file: acqDumb.tar
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqDumb_0.out
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqDumb_100.out

Looking inside file: acqfunny.tar
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqfunny_0.out
-rw-rw-r-- gaurav/gaurav     0 2016-10-05 00:58 acqfunny_100.out
 >Wed Oct 05|01:18:42|gaurav@[STATION]:/root/ga/scripts/temp/tmp %

Each tar file should really contain 500 files. Let us confirm that too.

 >Wed Oct 05|01:18:42|gaurav@[STATION]:/root/ga/scripts/temp/tmp % ls *.tar|while read file
 do 
     echo ; echo "File count in tar file: $file"
     tar -tvf $file|wc -l
 done

File count in tar file: acqcheat.tar
500

File count in tar file: acqDou.tar
500

File count in tar file: acqDumb.tar
500

File count in tar file: acqfunny.tar
500
 >Wed Oct 05|01:20:03|gaurav@[STATION]:/root/ga/scripts/temp/tmp %
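
Instead of reading the counts by eye, the number of entries in each archive can be compared against the number of matching files on disk. The following is a minimal sketch under the same module_number.out naming assumption. The scratch directory, the small sample files, and the names workdir, count_args and status are hypothetical and exist only to keep the example self-contained.

```shell
# Sketch only: a small scratch directory stands in for the real one.
workdir=$(mktemp -d)            # hypothetical scratch location
cd "$workdir" || exit 1
touch acqDou_0.out acqDou_1.out acqDumb_0.out
tar -cf acqDou.tar acqDou_*.out
tar -cf acqDumb.tar acqDumb_*.out

# Helper: print how many arguments it received (used to count glob matches).
count_args() { echo "$#"; }

status=0
for archive in *.tar; do
    module=${archive%.tar}
    in_tar=$(( $(tar -tf "$archive" | wc -l) ))      # entries inside the archive
    on_disk=$(count_args "${module}"_[0-9]*.out)     # matching files in the directory
    if [ "$in_tar" -eq "$on_disk" ]; then
        echo "OK       $archive: $in_tar files"
    else
        echo "MISMATCH $archive: $in_tar in archive, $on_disk on disk"
        status=1
    fi
done
```

After the loop, status is 0 only if every archive matched its files on disk.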

Cheers, Gaurav