zara zara - 4 months ago 7
Linux Question

How to extract rows from a file based on one column in another file in Linux?

I have two files that look like this:

file 1:
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes

file 2:
HO840F3000336251
HO840F3000487279


What I would like to do is extract the rows from file 1 whose first column also appears in the first column of file 2. Based on this example, the output should be:

file3:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes


Any suggestions?

Answer

UPDATE:

This command will create file3 with the desired output. Tested and works:

grep -f file2 file1 > file3

Output:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes

It uses grep's -f switch, which reads patterns from the named file, one per line. As per man grep:

    -f FILE, --file=FILE
        Obtain patterns from FILE, one per line.  The empty file contains zero patterns,
        and therefore matches nothing.  (-f is specified by POSIX.)
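
One caveat: grep matches each pattern anywhere in the line, so an ID that happened to appear in the second or third column would also be selected. If the match must be restricted to the first column, something like the following awk sketch may be safer. It assumes whitespace-separated fields and the file names from the question, and it recreates the sample data only so that the example is self-contained:

```shell
# Recreate the sample data from the question.
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
EOF
cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
EOF

# While reading file2 (NR == FNR holds only for the first file), remember each key.
# While reading file1, print a row only if its first field is one of those keys.
awk 'NR == FNR { keys[$1]; next } $1 in keys' file2 file1 > file3
```

Rows are written in the order they appear in file1, and the comparison is an exact string match on the first field rather than a regular-expression search.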