Janaranjan - 15 days ago
Linux Question

How to compare files with different columns in unix?

I want to compare the filenames in Today.txt with those in Main.txt.
If there is a match, print all 6 columns of the matching line from Main.txt to a new file, say matched.txt.

For the files in Today.txt that have no match in Main.txt, list the filename and time from Today.txt in a new file, say unmatched.txt.

Main.txt

date filename timestamp space count status
Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time


Today.txt

filename time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt 09:36
AR02_20161104.txt 09:36
ifs01_20161104.txt 21:16
TRIPS11_20161104.txt 09:16


Required Output:
matched.txt

Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time


unmatched.txt

CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt 21:16


Could you please help me with this?

Thanks a lot in advance!

Answer

awk to the rescue!

$ awk 'FNR==1{next}
       NR==FNR{a[$1]=$2; next}
       $3 in a{print; delete a[$3]}
       END{for(k in a) print k,a[k] > "unmatched"}' today main > matched

$ head *matched

==> matched <==
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time

==> unmatched <==
ifs01_20161104.txt 21:16
CHCK09_20161104.txt 21:46
CHCK05_20161104.txt 11:10
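
Two details of the command above are worth noting. The sample output shows CHCK01 without the leading "+", so the input used for that run presumably did not contain it; with the data exactly as posted in the question, $3 would be "+CHCK01_20161104.txt" and would not match. Also, `for (k in a)` visits keys in an unspecified order, which is why the unmatched list comes out in a different order than in Today.txt. The following is only a sketch of one way to handle both, assuming the "+" is a marker and not part of the filename, and using the file names from the question. The names `timeof`, `order`, and `key` are illustrative choices, not part of the original answer.

```shell
# Recreate the sample data from the question (header lines included).
cat > Main.txt <<'EOF'
date filename timestamp space count status
Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time
EOF

cat > Today.txt <<'EOF'
filename time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt 09:36
AR02_20161104.txt 09:36
ifs01_20161104.txt 21:16
TRIPS11_20161104.txt 09:16
EOF

awk '
    FNR == 1 { next }                  # skip the header line of each file
    NR == FNR {                        # first file on the command line: Today.txt
        timeof[$1] = $2                # filename -> time
        order[++n] = $1                # remember the order of appearance
        next
    }
    {                                  # second file: Main.txt
        key = $3
        sub(/^[+]/, "", key)           # assumption: a leading "+" is not part of the name
    }
    key in timeof {                    # filename also listed in Today.txt
        print                          # whole Main.txt line goes to stdout (matched.txt)
        delete timeof[key]             # whatever remains at END is unmatched
    }
    END {
        for (i = 1; i <= n; i++)       # walk Today.txt entries in their original order
            if (order[i] in timeof)
                print order[i], timeof[order[i]] > "unmatched.txt"
    }
' Today.txt Main.txt > matched.txt
```

With the sample data, matched.txt receives the four Main.txt lines unchanged (the "+" stays in the printed line because only the lookup key is stripped), and unmatched.txt lists CHCK05, CHCK09, and ifs01 in the order they appear in Today.txt. If every entry matches, unmatched.txt is not created at all.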