lightsnail lightsnail - 3 months ago 6
Linux Question

Select rows in one file based on specific values in the second file (Linux)

I have two files:

One is "total.txt". It has two columns: the first column is natural numbers ranging from 1 to 20, the second column contains random numbers.

1 321
1 423
1 2342
1 7542
2 789
2 809
2 5332
2 6762
2 8976
3 42
3 545
... ...
20 432
20 758


The other one is "index.txt". It has three columns.

1 400 5000
2 800 7000
7 600 3000
11 300 4000
12 100 5000


I want to output the rows of "total.txt" file with first column matches with the first column of "index.txt" file. And at the same time, the second column of output results must be larger than (>) the second column of the "index.txt" and smaller than (<) the third column of the "index.txt".

The expected result is as follows:

1 423
1 2342
2 809
2 5332
2 6762
7 ...
7 ...
7 ...
11 ...
11 ...
12 ...
12 ...


I have tried this:

awk '$1==(awk '{print $1; exit}' index.txt) && $2>(awk '{print $2; exit}' index.txt) && $2<(awk '{print $3; exit}' index.txt) total.txt' > result.txt


But it failed. Could you help me with that? Thanks!

Answer

You need to read both files in the same awk script. When you read index.txt, store the other columns in an array.

awk 'FNR == NR { low[$1] = $2; high[$1] = $3; next }
    $2 > low[$1] && $2 < high[$1] { print }' index.txt total.txt

FNR == NR is the common awk idiom to detect when you're processing the first file.