Ali Jaber - 14 days ago
Bash Question

Reformatting text files from rows to columns

I have multiple files in a directory that I need to reformat, writing the combined output to a single file. Each file has this structure:

========================================================
Daily KPIs - DATE: 24/04/2013
========================================================

--------------------------------------------------------
Number of des = 5270
--------------------------------------------------------
Number of users = 210
--------------------------------------------------------
Number of active = 520
--------------------------------------------------------
Total non = 713
--------------------------------------------------------

========================================================


I need the output format to be:

Date,Numberofdes,Numberofusers,Numberofactive,Totalnon
24042013,5270,210,520,713


The directory has around 1500 files with the same format, and I'm using CentOS 7.

Thanks

Answer

First we need a method to join the elements of an array into a string (cf. Bash: Join elements of an array?):

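# Join all arguments after the first, using the first argument as separator.
function join_array()
{
    local IFS=$1    # "$*" joins its words with the first character of IFS
    shift           # drop the separator, leaving only the elements
    echo "$*"
}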
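As a quick check of how the function behaves, here is a small usage sketch. The sample values are simply the numbers from the question, and since "$*" only uses the first character of IFS, the separator is assumed to be a single character.

```shell
join_array()
{
    local IFS=$1
    shift
    echo "$*"
}

# Sample values taken from the file shown in the question.
fields=(24/04/2013 5270 210 520 713)
join_array , "${fields[@]}"
# prints: 24/04/2013,5270,210,520,713
```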

Then we can loop over the files and convert each one into a comma-separated list (assuming that the original files have names ending in .txt).

for f in *.txt
do
    # Quote "$f" so that file names containing spaces are handled correctly.
    sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p' < "$f" | {
        mapfile -t fields
        join_array , "${fields[@]}"
    }
done
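To get closer to the exact output asked for in the question, the loop could be wrapped as in the sketch below. This is only one possible arrangement: the function name kpis_to_csv is made up for illustration, the header line is hard-coded from the question rather than derived from the files, and tr -d / is used on the assumption that slashes only ever appear in the date.

```shell
join_array()
{
    local IFS=$1
    shift
    echo "$*"
}

# kpis_to_csv DIR (hypothetical helper): print a header followed by
# one CSV line per *.txt file found in DIR.
kpis_to_csv()
{
    local dir=$1 f fields
    echo "Date,Numberofdes,Numberofusers,Numberofactive,Totalnon"
    for f in "$dir"/*.txt
    do
        [ -e "$f" ] || continue    # skip the unexpanded pattern if nothing matches
        # Keep the value after ':' or '=', then drop the slashes of the date.
        sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p' < "$f" | tr -d '/' | {
            mapfile -t fields
            join_array , "${fields[@]}"
        }
    done
}
```

It could then be called as kpis_to_csv . > kpis.csv, where kpis.csv is just an example output name. Keeping the output name away from the .txt suffix avoids the loop picking up its own result.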

Here, the sed command looks inside each input file for lines that:

  1. begin with a non-empty substring that contains neither a : nor a = character (the [^:=]\+ part);
  2. continue with a : or a =, followed by an arbitrary number of spaces (the [:=] * part);
  3. finally, end with an arbitrary substring (the \(.*\) part).

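The last substring is then captured and printed instead of the original line. Any other line in the input files is discarded. Note that the date is printed exactly as captured (24/04/2013), so the slashes still have to be removed, for instance with tr -d /, to obtain 24042013.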
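To see the substitution in isolation, the same sed expression can be fed a few lines from the question's sample file. The wrapper name extract_values is only introduced here for illustration, and the \+ operator assumes GNU sed, which is what CentOS 7 ships.

```shell
# extract_values (hypothetical helper): read lines on stdin and print
# only the value that follows the first ':' or '=' on matching lines.
extract_values()
{
    sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p'
}

extract_values <<'EOF'
========================================================
Daily KPIs - DATE: 24/04/2013
========================================================
--------------------------------------------------------
Number of des = 5270
--------------------------------------------------------
EOF
# prints:
# 24/04/2013
# 5270
```

The separator lines produce no output: a line made only of = has no leading character outside the [:=] set, and a line made only of - never reaches a : or =.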

After that, the output of sed is read by mapfile into the indexed array variable fields (the -t option strips the trailing newline from each line read), and finally the elements are joined by our previously defined join_array function.
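The effect of -t can be checked with a small experiment like the one below. It uses process substitution instead of a pipe purely so that the arrays remain visible afterwards; the array name raw is made up for the comparison.

```shell
# With -t, each element holds the line without its trailing newline.
mapfile -t fields < <(printf '%s\n' 24/04/2013 5270 210)
echo "${#fields[@]}"    # prints: 3
echo "${fields[1]}"     # prints: 5270

# Without -t, every element keeps the newline it was read with.
mapfile raw < <(printf '%s\n' 5270 210)
echo "${#raw[0]}"       # prints: 5 (four digits plus the newline)
```

Without -t, join_array would emit a line break after every value instead of producing a single comma-separated line.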

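The reason we need to wrap mapfile and join_array together in a { ...; } group is that each command in a pipeline runs in its own subshell, so an array filled by mapfile on its own would be lost as soon as the pipeline ends. This is explained here: readarray (or pipe) issue.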
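The difference can be reproduced with the short experiment below. It assumes bash's default settings, in particular that the lastpipe shell option is not enabled.

```shell
# mapfile runs in the pipeline's subshell, so the calling shell
# never sees the array it filled.
unset fields
printf '%s\n' a b c | mapfile -t fields
echo "outside the pipeline: ${#fields[@]} elements"
# prints: outside the pipeline: 0 elements

# Grouping keeps mapfile and the code that uses the array
# inside the same subshell.
printf '%s\n' a b c | {
    mapfile -t fields
    echo "inside the group: ${#fields[@]} elements"
}
# prints: inside the group: 3 elements
```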