zara zara - 4 months ago 12
Linux Question

How to extract rows from file1 whose first column matches an entry in file2 on Linux?

I have two files that look like this:

file 1:
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes

file 2:
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108


I would like to extract the rows of file 1 whose first column matches the first column of file 2. Based on this example, the output should be:

file3:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes


Any suggestions?

Answer

UPDATE:

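This command creates file3 with the desired output:

sed 's/^/^/; s/$/[[:space:]]/' file2 | grep -f - file1 > file3

The sed step turns each ID in file2 into a pattern anchored to the start of the line and followed by whitespace, so only the first column of file1 is compared. A plain grep -f file2 file1 matches the IDs anywhere in the line, so on the sample data it would also return the HOUSAF55969445 row, whose second column is HOUSAM55967108. GNU grep documents -f - as reading the patterns from standard input; other implementations may need a temporary pattern file instead.

Output:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes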

It uses the -f switch in grep, which takes the name of a file containing one pattern per line. As per man grep:

    -f FILE, --file=FILE
        Obtain patterns from FILE, one per line.  The empty file contains zero patterns,
        and therefore matches nothing.  (-f is specified by POSIX.)
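
Because grep -f matches each pattern anywhere in the line, an ID from file2 that also occurs in another column of file1 selects that row too. The following is a minimal sketch of an alternative that compares only the first column, using awk. It assumes whitespace-separated columns and recreates the sample files from the question in the current directory; it is one possible approach, not the original answer's method.

```shell
# Recreate the sample data from the question (overwrites file1/file2 in the current directory).
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
EOF

cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
EOF

# While reading the first file named (file2), NR == FNR holds: store each ID as an array key.
# While reading the second file (file1), print a row only if its first field is a stored key.
awk 'NR == FNR { keys[$1]; next } $1 in keys' file2 file1 > file3

cat file3
```

The row starting with HOUSAF55969445 is not selected here, because HOUSAM55967108 appears only in its second column. The lookup is an exact string comparison on the first field, so IDs that are prefixes of other IDs do not cause extra matches.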