Doej - 6 months ago
Bash Question

How to check for the presence of two words in a file using grep

I have two files, A.txt and B.txt, each containing a list of words, as shown below.

File A.txt

hello
hi
ko


File B.txt

fine
No
And how
why


Now I want to check whether a line in another file, C.txt, contains words from both lists (a word from A.txt AND a word from B.txt).

I am using this grep command:

grep -iof A.txt C.txt| grep B.txt


C.txt contains sentences with words from A.txt and B.txt:

Hello I am fine
I am not fine
why ko is and how?


The grep command above doesn't show any output.

So, if a word from A.txt and a word from B.txt are both present in the same sentence, I want the output to be:

Hello fine
why ko and how


That is, I want to print only the matching words from both files when they occur together in a line of C.txt, instead of printing the whole line.

Answer

You probably want to say (the files are named A, B and C here, without the .txt extension):

$ grep -if B <(grep -if A C)
Hello I am fine
why ko is and how?

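This uses -f to read the patterns from a file, one per line. The input to the outer grep can be a regular file... or, as here, a file created on the fly with the process substitution <( ... ), which holds the output of the inner grep.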

First, grep -if A C selects the lines in C that contain any of the words in A:

$ grep -if A C
Hello I am fine        # "Hello" highlighted
why ko is and how?     # "ko" highlighted

Then, its output is filtered again, this time against the patterns in B:

$ grep -if B <(grep -if A C)
Hello I am fine        # "fine" highlighted
why ko is and how?     # "and how" highlighted
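
To try this end to end, here is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the same two-stage filter. The scratch directory and the variable name lines are only for illustration. It uses a plain pipe instead of <( ... ); for this command the effect is the same, since either way the outer grep reads the inner grep's output.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'hello' 'hi' 'ko' > "$tmp/A.txt"
printf '%s\n' 'fine' 'No' 'And how' 'why' > "$tmp/B.txt"
printf '%s\n' 'Hello I am fine' 'I am not fine' 'why ko is and how?' > "$tmp/C.txt"

# Stage 1 keeps the lines of C.txt that match a pattern from A.txt;
# stage 2 keeps those that also match a pattern from B.txt.
lines=$(grep -if "$tmp/A.txt" "$tmp/C.txt" | grep -if "$tmp/B.txt")
printf '%s\n' "$lines"

rm -rf "$tmp"
```

With the sample data this should print the same two lines as above; "I am not fine" is dropped in stage 1 because it contains nothing from A.txt.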

Depending on your needs, you may also want -F and -w alongside the -i and -f used above.
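
As a small illustration of why -w can matter with these word lists: the pattern "No" from B.txt also matches inside "not" unless whole-word matching is requested. The sketch below counts matching lines with -c; the variable names loose and strict are made up for the example.

```shell
# Without -w, "No" matches as a substring of "not" (case-insensitively).
loose=$(printf '%s\n' 'I am not fine' | grep -ic 'No' || true)

# With -w, "No" must stand alone as a word, so the line no longer matches.
# grep exits non-zero when nothing matches, hence the "|| true".
strict=$(printf '%s\n' 'I am not fine' | grep -icw 'No' || true)

printf 'without -w: %s, with -w: %s\n' "$loose" "$strict"
```

This should report one matching line without -w and zero with it.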

From man grep:

   -f FILE, --file=FILE
          Obtain  patterns  from  FILE,  one  per  line.   The  empty file
          contains zero patterns, and therefore matches nothing.   (-f  is
          specified by POSIX.)

   -F, --fixed-strings
          Interpret PATTERN as a  list  of  fixed  strings,  separated  by
          newlines,  any  of  which is to be matched.  (-F is specified by
          POSIX.)

   -i, --ignore-case
          Ignore  case  distinctions  in  both  the  PATTERN and the input
          files.  (-i is specified by POSIX.)

   -w, --word-regexp
          Select  only  those  lines  containing  matches  that form whole
          words.  The test is that the matching substring must  either  be
          at  the  beginning  of  the  line,  or  preceded  by  a non-word
          constituent character.  Similarly, it must be either at the  end
          of  the  line  or  followed by a non-word constituent character.
          Word-constituent  characters  are  letters,  digits,   and   the
          underscore.
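
The commands above print whole lines, while the question asks for only the matching words. One possible way to get that, sketched here under some assumptions, is to add -o (print only the matched parts; an extension found in GNU and BSD grep, not in POSIX) and match each surviving line against the combined list. The scratch directory, the file AB.txt and the variable words are hypothetical names for this example.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'hello' 'hi' 'ko' > "$tmp/A.txt"
printf '%s\n' 'fine' 'No' 'And how' 'why' > "$tmp/B.txt"
printf '%s\n' 'Hello I am fine' 'I am not fine' 'why ko is and how?' > "$tmp/C.txt"

# Combined pattern list, used only to extract the matched words.
cat "$tmp/A.txt" "$tmp/B.txt" > "$tmp/AB.txt"

# Keep the lines that match both lists, then, for each surviving line,
# print only the matched words (-o) and join them with spaces.
words=$(
  grep -iwFf "$tmp/A.txt" "$tmp/C.txt" | grep -iwFf "$tmp/B.txt" |
  while IFS= read -r line; do
    printf '%s\n' "$line" | grep -iowFf "$tmp/AB.txt" | paste -s -d ' ' -
  done
)
printf '%s\n' "$words"

rm -rf "$tmp"
```

With the sample files this should print "Hello fine" and "why ko and how", which is the output the question asks for. The per-line loop is there so that the words of each sentence stay together on one output line.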