Koushik Chandra - 2 months ago
Bash Question

finding maximum from partial string

I have a list where the first 8 digits are a date in the format yyyymmdd. The next 4 digits are part of the timestamp. I want to select only those numbers that have the maximum timestamp for each day.

20160905092900
20160905212900
20160906092900
20160906213000
20160907093000
20160907213000
20160908093000
20160908213000
20160910093000
20160910213100
20160911093100
20160911213100
20160912093100


That is, from the list above, the output should be the list below.

20160905212900
20160906213000
20160907213000
20160908213000
20160910213100
20160911213100
20160912093100

Answer

You can use awk:

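awk '{
   dt = substr($0, 1, 8)   # date part: yyyymmdd
   ts = substr($0, 9)      # time part: everything after the date
}
ts > max[dt] {             # string comparison; safe because the width is fixed
   max[dt] = ts
   rec[dt] = $0
}
END {
   for (i in rec)          # iteration order is unspecified in awk
      print rec[i]
}' file | sort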

20160905212900
20160906213000
20160907213000
20160908213000
20160910213100
20160911213100
20160912093100

The associative array max uses the first 8 characters (the date) as its key and the remaining characters (the time) as its value, so it holds the maximum timestamp seen so far for each date. A second array, rec, stores the full line for a date whenever we encounter a timestamp greater than the value stored in max.
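
As an alternative sketch (not the approach above, but the same idea expressed with sort), you could sort the lines in descending order so that the first line seen for each date is already its maximum, keep only that first line, and then sort again to restore ascending order. The function name max_per_day is hypothetical, and the sketch assumes every line is a fixed-width yyyymmddhhmmss string:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line with the latest timestamp for each day.
# Assumes fixed-width yyyymmddhhmmss input, one value per line.
max_per_day() {
    # 1. sort -r: descending, so the maximum of each date comes first
    # 2. awk: keep only the first line seen for each 8-character date prefix
    # 3. sort: restore ascending order
    sort -r "$@" | awk '!seen[substr($0, 1, 8)]++' | sort
}

# Usage: reads the named files, or standard input if none are given.
printf '%s\n' 20160905092900 20160905212900 20160906092900 | max_per_day
```

This trades the single pass of the awk answer for two sorts, but the output order no longer depends on how awk iterates over an array.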
