Bhavya Arora - 1 year ago
Bash Question

bash Join command, Leaving out a row of numbers

I have two files, and I want to extract the rows that share a value in the third column. However, join is leaving out a row that should be matched.


File1.txt:

b b b
4 5 3
c c c


File2.txt:

1 2 3 4
a b c d
e f g h
i j k l
l m n o

The output is:

c c c a b d

The command used is:

join -1 3 -2 3 --nocheck-order File1.txt File2.txt

The row with 3 as the common field is missing, even after adding --nocheck-order.


Expected output:

c c c a b d
3 4 5 1 2 4

Answer

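join expects both inputs to be sorted on the join field. --nocheck-order only turns off the check for unsorted input; it does not make join cope with unsorted files, so a row such as the one with 3 can still be skipped. As an alternative to running two sort commands (which can be very expensive for big files) and then a join, you can use this single awk command to get your output: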

awk 'FNR == NR{a[$3]=$0; next} $3 in a{print $3, a[$3], $1, $2, $4}' file1 file2

Output:

3 4 5 3 1 2 4
c c c c a b d
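
The output above carries the key twice: once as $3 and once inside the stored File1 line. If the exact column layout from the question is wanted, a variant along these lines may help. It is only a sketch: the scratch directory and the variable names (tmpdir, result) are made up for illustration, and the sample data is copied from the question. It stores just the two non-key fields of each File1 row instead of the whole line.

```shell
# Recreate the sample files from the question in a scratch directory
# (tmpdir is a hypothetical name used only for this demo).
tmpdir=$(mktemp -d)
printf '%s\n' 'b b b' '4 5 3' 'c c c' > "$tmpdir/File1.txt"
printf '%s\n' '1 2 3 4' 'a b c d' 'e f g h' 'i j k l' 'l m n o' > "$tmpdir/File2.txt"

result=$(awk '
  # First file: remember only the non-key fields, indexed by column 3.
  NR == FNR { a[$3] = $1 " " $2; next }
  # Second file: print key, rest of the File1 row, rest of the File2 row.
  $3 in a   { print $3, a[$3], $1, $2, $4 }
' "$tmpdir/File1.txt" "$tmpdir/File2.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints "3 4 5 1 2 4" followed by "c c c a b d", in the order the rows appear in File2.txt. As with the original command, a key that occurs more than once in File1.txt keeps only its last row.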


How the one-liner works:

NR == FNR {                  # While processing the first file
  a[$3] = $0                 # store the whole line in array a using $3 as key
  next                       # and move on to the next input line
}
$3 in a {                    # while processing the 2nd file, when $3 is found in array
  print $3,a[$3],$1,$2,$4    # print relevant fields from file2 and the remembered
                             # value from the first file.
}
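
For comparison, the sort-then-join route mentioned at the start of the answer could look roughly like the sketch below. The scratch directory, the .sorted file names, and the variable names are assumptions for the demo, and LC_ALL=C is set so that sort and join agree on the collation order.

```shell
# Recreate the sample files from the question in a scratch directory
# (tmpdir and the *.sorted names are hypothetical, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' 'b b b' '4 5 3' 'c c c' > "$tmpdir/File1.txt"
printf '%s\n' '1 2 3 4' 'a b c d' 'e f g h' 'i j k l' 'l m n o' > "$tmpdir/File2.txt"

# join needs both inputs sorted on the join field (column 3 in each file).
LC_ALL=C sort -k3,3 "$tmpdir/File1.txt" > "$tmpdir/File1.sorted"
LC_ALL=C sort -k3,3 "$tmpdir/File2.txt" > "$tmpdir/File2.sorted"

# Join on column 3 of both files; the key is printed first, then the
# remaining fields of File1, then the remaining fields of File2.
joined=$(LC_ALL=C join -1 3 -2 3 "$tmpdir/File1.sorted" "$tmpdir/File2.sorted")

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

On the sample data this yields both expected rows, sorted by key, so the row for 3 comes before the row for c. In bash the temporary sorted files can be avoided with process substitution, for example join -1 3 -2 3 <(sort -k3,3 File1.txt) <(sort -k3,3 File2.txt).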