diborbi - 5 months ago
Linux Question

Compare columns from two files and print the lines that do not match

I want to compare the first 4 columns of file1 and file2, then print all lines from file1 plus the lines from file2 whose first 4 columns do not appear in file1.

File1:
2435 2 2 7 specification 9-8-3-0
57234 1 6 4 description 0-0 55211
32423 2 44 3 description 0-0 24242

File2:
2435 2 2 7 specification
7624 2 2 1 namecomplete
57234 1 6 4 description
28748 34 5 21 gateway
32423 2 44 3 description
832758 3 6 namecomplete

output:
2435 2 2 7 specification 9-8-3-0
57234 1 6 4 description 0-0 55211
32423 2 44 3 description 0-0 24242
7624 2 2 1 namecomplete
28748 34 5 21 gateway
832758 3 6 namecomplete


I don't understand how to print things that don't match.

Answer

You can do it with an awk script like this:

script.awk

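# Lines from the first file: remember the key (the first 4 columns) and print the line.
# The commas join the fields with SUBSEP, so "1 23" and "12 3" give different keys.
FNR == NR {
    mem[$1, $2, $3, $4] = 1
    print
    next
}

# Lines from the second file: print only if the key was not seen in the first file.
!(($1, $2, $3, $4) in mem)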

Run it like this: awk -f script.awk file1 file2

The first block runs only when FNR == NR, which is true only while awk is reading the first file. It stores the key fields in mem, prints the whole line, and moves on to the next line.
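To see why FNR == NR selects only the first file, it can help to print both counters. NR counts lines across all input, while FNR restarts at 1 for each file, so the two agree only during the first file. The sketch below is just an illustration; the file names a.txt and b.txt and the labels "first" and "later" are made up for the demo.

```shell
# Two throwaway files in a scratch directory (hypothetical names a.txt and b.txt).
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/a.txt"
printf 'p\nq\nr\n' > "$tmp/b.txt"

# For every line, print the global counter NR, the per-file counter FNR,
# and whether the two are equal ("first") or not ("later").
counters=$(awk '{ print NR, FNR, (FNR == NR ? "first" : "later"), $0 }' "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$counters"

rm -r "$tmp"
```

Only the two lines of a.txt are labelled "first"; from the first line of b.txt onward NR is 3 or more while FNR starts again at 1.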

The second block therefore applies only to lines from the second file. It checks whether the line's key is in mem; if it is not, the line has no match in file1 and is printed.
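To try the approach end to end, here is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the same logic as an inline awk program. It keys on the first 4 columns, as the question asks, and separates them with commas in the subscript so awk joins them with SUBSEP and adjacent fields cannot run together. The array name seen and the scratch directory are assumptions made for the demo.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
2435 2 2 7 specification 9-8-3-0
57234 1 6 4 description 0-0 55211
32423 2 44 3 description 0-0 24242
EOF
cat > "$dir/file2" <<'EOF'
2435 2 2 7 specification
7624 2 2 1 namecomplete
57234 1 6 4 description
28748 34 5 21 gateway
32423 2 44 3 description
832758 3 6 namecomplete
EOF

# Print every line of file1, then the file2 lines whose first 4 columns are new.
result=$(awk '
    FNR == NR { seen[$1, $2, $3, $4] = 1; print; next }
    !(($1, $2, $3, $4) in seen)
' "$dir/file1" "$dir/file2")
printf '%s\n' "$result"

rm -r "$dir"
```

On this data the result is the three file1 lines followed by the 7624, 28748 and 832758 lines from file2, which matches the output requested in the question.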