pauloss - 5 months ago
Linux Question

Bash join over multiple fields

I have two files that look like this:

file1

a b 1 1
b d 2 3


file2

a 10 11
b 20 21
d 30 31


I would like to join them to get the following output: each line of file1, followed by the file2 values for its first field and then the file2 values for its second field:

a b 1 1 10 11 20 21
b d 2 3 20 21 30 31


I tried to use join, but I can't manage to join files according to the first two fields of file1.

Answer

Since you want to join on two fields, you'll need to join twice, piping the stdout of the first join to the stdin of the second:

join -1 1 -2 1 file1 file2 | join -1 2 -2 1 - file2
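To see what the second join receives, it can help to run the first pass on its own. This is a minimal sketch that recreates the sample files from the question in a scratch directory; the `workdir` and `stage1` names are only illustrative.

```shell
# Recreate the sample files from the question (scratch directory is an assumption).
workdir=$(mktemp -d)
printf '%s\n' 'a b 1 1' 'b d 2 3' > "$workdir/file1"
printf '%s\n' 'a 10 11' 'b 20 21' 'd 30 31' > "$workdir/file2"

# First pass only: match field 1 of file1 against field 1 of file2.
stage1=$(join -1 1 -2 1 "$workdir/file1" "$workdir/file2")
printf '%s\n' "$stage1"
# a b 1 1 10 11
# b d 2 3 20 21
```

Each line now carries the file2 values for its first field, and the second join only has to look up field 2 of this intermediate result.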

Edit: Ah shoot, that reverses the order of the first two fields, because join always prints the join field first. Is that ok?

b a 1 1 10 11 20 21
d b 2 3 20 21 30 31

Edit 2: This might be better -- if you join on the second field first and then on the first field, you'll get the first two columns in the right order, but the two pairs of columns taken from file2 will be swapped:

join -1 2 -2 1 file1 file2 | join -1 2 -2 1 - file2

Yields:

a b 1 1 20 21 10 11
b d 2 3 30 31 20 21
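If the exact layout from the question is needed, one option is to join on field 1 first and give the second join an explicit output list with `-o`, which takes fields as FILE.FIELD. The following is a sketch against the sample data only; the scratch directory and the `result` variable are illustrative, and it assumes every input is already sorted on the field being joined, as join expects.

```shell
# Recreate the sample files from the question (scratch directory is an assumption).
workdir=$(mktemp -d)
printf '%s\n' 'a b 1 1' 'b d 2 3' > "$workdir/file1"
printf '%s\n' 'a 10 11' 'b 20 21' 'd 30 31' > "$workdir/file2"

# Pass 1 joins on field 1 of file1; pass 2 joins field 2 of that result with file2.
# In the -o list, 1.x refers to the left input (stdin here) and 2.x to file2.
result=$(
  join -1 1 -2 1 "$workdir/file1" "$workdir/file2" |
    join -1 2 -2 1 -o 1.1,1.2,1.3,1.4,1.5,1.6,2.2,2.3 - "$workdir/file2"
)
printf '%s\n' "$result"
# a b 1 1 10 11 20 21
# b d 2 3 20 21 30 31
```

With larger inputs the intermediate result will generally need a `sort -k2,2` between the two joins, since the output of the first pass is ordered by field 1 rather than field 2.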