LamaEu - 6 months ago
Bash Question

Sorting by date with variable number of columns

I want to sort lines that each end in a date, but I can't figure out how to sort them while keeping each line whole. I also don't understand how to use a pipe to sort the lines.

For example, my script receives this as a text file:

asdsa 24 asdsa 3 3000 054217542 30.3.2016
asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016


I would like to read line by line:

while read line; do

done < "$1"


And inside the loop, sort the lines by their dates. How can I sort the lines as they are in the file while I read them one by one?

What if I do this:

#!/bin/bash

PATH=${PATH[*]}:.
#filename: testScript


while read line; do
arr=( $line )
num_of_params=`echo ${#arr[*]}`
echo $line | sort -n -k$num_of_params

num_of_params=0
done < "$1"


My problem with this is that I send each line to sort on its own, not all the lines together, but I don't know any other way to do it (I don't want to use temp files).

Output:

asdsa 24 asdsa 3 3000 054217542 30.3.2016
asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016


Desired output:

asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016
asdsa 24 asdsa 3 3000 054217542 30.3.2016


As you can see, it didn't work.

How can I fix that?

Answer

Here is a solution using a Schwartzian transform (decorate, sort, undecorate) with awk, sort and cut:

awk '{split($NF,arr,"."); printf("%d%02d%02d\t%s\n",arr[3],arr[2],arr[1],$0)}' infile |
sort -k 1,1 | cut -f 2-

The awk part first splits the last field of the record, $NF (the date), at the periods into an array arr:

split($NF,arr,".")

The second part prints the line with the reformatted date prepended and separated from it by a tab: first the year, then the month and the day, the latter two zero-padded to two digits:

printf("%d%02d%02d\t%s\n",arr[3],arr[2],arr[1],$0)

The output of this looks as follows:

20160330        asdsa 24 asdsa 3 3000 054217542 30.3.2016
20140102        asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
20160306        dsasda 23 dsada 4 3200 537358234 6.3.2016
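The zero padding is what gives every key the same width, so that the keys order correctly even when compared as plain strings. The key construction can be tried on its own with a small sketch like the one below; date_key is a made-up helper name for illustration, not part of the pipeline above, and it assumes dates of the form D.M.YYYY.

```shell
# date_key is a hypothetical helper: it turns one D.M.YYYY date
# into the fixed-width YYYYMMDD key used for sorting.
date_key() {
    printf '%s\n' "$1" |
        awk '{split($0, arr, "."); printf("%d%02d%02d\n", arr[3], arr[2], arr[1])}'
}

date_key 6.3.2016     # prints 20160306
date_key 30.3.2016    # prints 20160330
```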

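Now we can just pipe to sort and sort on the first field; because every key has the same width, the default string comparison puts the dates in chronological order: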

sort -k 1,1

resulting in

20140102        asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
20160306        dsasda 23 dsada 4 3200 537358234 6.3.2016
20160330        asdsa 24 asdsa 3 3000 054217542 30.3.2016

And finally, we remove our inserted field again with cut, leaving only everything from the second field on. cut splits on tabs by default, so the spaces inside the original line are left alone:

cut -f 2-

resulting in

asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016
asdsa 24 asdsa 3 3000 054217542 30.3.2016
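
Since the question asks for a script that receives the file name as "$1", the three steps might be wrapped up roughly as follows. This is only a sketch: sort_by_date is a made-up function name, and it assumes every line ends in a D.M.YYYY date with a four-digit year. No temp files are needed, because the data only flows through the pipes.

```shell
# sort_by_date is a hypothetical name; it wraps the awk | sort | cut pipeline.
# Argument: the file whose lines end in a D.M.YYYY date.
sort_by_date() {
    # Decorate: prepend a YYYYMMDD key and a tab to every line.
    awk '{split($NF, arr, "."); printf("%d%02d%02d\t%s\n", arr[3], arr[2], arr[1], $0)}' "$1" |
        # Sort on the key only.
        sort -k 1,1 |
        # Undecorate: drop the key; cut splits on tabs by default.
        cut -f 2-
}

# Run only when a file name was passed on the command line.
if [ "$#" -gt 0 ]; then
    sort_by_date "$1"
fi
```

Saved as testScript, this would be run as ./testScript input.txt, where input.txt stands for whatever file holds the lines.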