user2607210 - 1 month ago
Linux Question

Comparing two flat files in Linux (bash): how do I add what is missing to the first file?

I am trying to compare two flat files in bash. The first file has three columns separated by | and the second has two columns separated by |. I want to take the rows that appear in the second file but are missing from the first and add them to the first file. Only the two columns that file 2 has need to be carried over to file 1.

Example files

File one:

a|blue|3
b|yellow|1
c|green|2

File two:

a|blue
b|yellow
c|green
d|purple

Output file:

a|blue|3
b|yellow|1
c|green|2
d|purple

Answer

This should work:

# Set the input field separator to "|"
awk -F'|' '

# f2 is listed first on the command line, so it is read first. NR==FNR is true only
# while that first input is being read; store each of its lines as a key in array "a".
NR==FNR { a[$0]; next }

# For each line of f1, check whether its first two columns (joined by "|") are a key in
# the array. If so, delete that key. The trailing 1 prints the f1 line as is.
($1 FS $2 in a) { delete a[$1 FS $2] }1

# This runs at the very end. Whatever is left in the array is missing from f1, so print it.
# The loop order is unspecified, so pipe the output through sort if a sorted result is needed.
END { for (l in a) print l }' f2 f1
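
Because for (l in a) visits the leftover keys in no guaranteed order, several missing rows may come out shuffled. A possible variant, not part of the original answer, reads the first file first and then prints unseen rows in the order they appear in the second file. The function name merge_ordered and the array name seen are made up for this sketch, and it assumes the first file is not empty (the NR==FNR test cannot tell the two inputs apart otherwise).

```shell
# merge_ordered FILE1 FILE2   (hypothetical helper, for illustration only)
# Prints FILE1 unchanged, then every FILE2 row whose two columns
# do not already appear as the first two columns of a FILE1 row.
merge_ordered() {
    awk -F'|' '
        # First input (FILE1): remember its first two columns and print the line.
        NR == FNR { seen[$1 FS $2]; print; next }

        # Second input (FILE2): print rows not seen yet, and record them
        # so that duplicate rows in FILE2 are printed only once.
        !(($1 FS $2) in seen) { seen[$1 FS $2]; print }
    ' "$1" "$2"
}
```

Called as merge_ordered f1 f2 on the sample files, this should print the three f1 lines followed by d|purple, without needing sort.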

Output:

$ head f*
==> f1 <==
a|blue|3
c|green|2
b|yellow|1

==> f2 <==
a|blue
c|green
b|yellow
d|purple

$ awk -F'|' '
NR==FNR { a[$0]; next }
($1 FS $2 in a) { delete a[$1 FS $2] }1
END { for (l in a) print l }' f2 f1
a|blue|3
c|green|2
b|yellow|1
d|purple
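
The run above only prints the merged result; f1 itself is left untouched. Since the question asks for the missing rows to end up in the first file, one way to do that is to write the output to a temporary file and copy it back once awk has finished. This is a minimal sketch under a few assumptions: the function name update_first_file is invented here, mktemp is available (it is on typical Linux systems), and both files are non-empty.

```shell
# update_first_file FILE1 FILE2   (hypothetical helper, for illustration only)
# Runs the awk command from the answer and overwrites FILE1 with its output.
update_first_file() {
    tmp=$(mktemp) || return 1

    # Redirecting straight to FILE1 would truncate it before awk reads it,
    # so the result goes to a temporary file first.
    if awk -F'|' '
        NR==FNR { a[$0]; next }
        ($1 FS $2 in a) { delete a[$1 FS $2] }1
        END { for (l in a) print l }
    ' "$2" "$1" > "$tmp"
    then
        # Copy the contents back so FILE1 keeps its permissions.
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp"
    return "$status"
}
```

After update_first_file f1 f2, the file f1 should hold its original three rows plus d|purple.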