user2531569 - 5 months ago
Bash Question

Comparing field data line by line in Python or shell

I have an input file with the data below:

Mode|Date|Count|timestamp|status
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
....


Now I am trying to compare the lines two at a time to find out which fields have mismatched data. I tried quite a few approaches in a Python script, but had no luck. My output should look like this:

Count Mismatch:
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
timestamp Mismatch:
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
Count and Status Mismatch:
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
....


Could someone please help me with this? Thanks in advance.

Answer

You can use awk:

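awk 'BEGIN {
   FS = "|"                      # fields are pipe-separated
}
NR == 1 {
   split($0, h, /\|/)            # h[i] holds the header name of column i
   next
}
NR % 2 == 0 {
   pr = $0                       # first record of a pair: remember it
   split($0, a, /\|/)
   next
}
{
   # second record of a pair: collect the names of the fields that differ
   s = ""
   for (i = 1; i <= NF; i++)
      if ($i != a[i])
         s = sprintf("%s%s", s, (s == "" ? "" : " and ") h[i])
   if (s != "")                  # skip pairs with no differences
      print s, "Mismatch:" ORS pr ORS $0
}' file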

Output:

Count Mismatch:
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
timestamp Mismatch:
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
Count and status Mismatch:
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
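
Since the question also mentions Python, here is a rough sketch of the same pairing idea in Python. It is only an illustration, not a tested drop-in: the function name find_mismatches and the SAMPLE list are made up for this example, blank lines are skipped (the awk version does not do that), fields are compared as plain strings, and a trailing line without a partner is ignored.

```python
def find_mismatches(lines):
    """Yield (label, first, second) for each pair of data lines that differ.

    Assumes the first non-blank line is the pipe-separated header and the
    remaining lines are compared two at a time (1st data line with 2nd,
    3rd with 4th, and so on).
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    header = rows[0].split("|")
    data = rows[1:]
    # Walk the data in non-overlapping pairs; an unpaired last line is dropped.
    for first, second in zip(data[0::2], data[1::2]):
        differing = [
            name
            for name, a, b in zip(header, first.split("|"), second.split("|"))
            if a != b
        ]
        if differing:
            yield " and ".join(differing) + " Mismatch:", first, second


# Sample data copied from the question (hypothetical variable name).
SAMPLE = """Mode|Date|Count|timestamp|status
HR|06/08/2016|3000|Thu Jun 09 2016|Complete
HR|06/08/2016|3010|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Thu Jun 09 2016|Complete
HR|06/08/2016|2000|Fri Jun 09 2016|Complete
HR|06/08/2016|1000|Thu Jun 09 2016|Complete
HR|06/08/2016|1500|Thu Jun 09 2016|Failure
""".splitlines()

for label, first, second in find_mismatches(SAMPLE):
    print(label, first, second, sep="\n")
```

To read from a real file instead, pass the open file object to find_mismatches, since it accepts any iterable of lines.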