Allen Allen - 4 months ago 8
Bash Question

Why did I get different answers when I changed grep to egrep in the latter half of each pipeline?

$ egrep "^COMP[29]041" enrolments | grep "|F$" | wc -l
24
$ egrep "^COMP[29]041" enrolments | egrep "|F$" | wc -l
166
$


The content of the file enrolments:

COMP2041|4836917|Ruld, Ruld |3978/2|M
COMP2041|4850109|Rvyiparzal, Ilbvuy |3979/3|M
COMP2041|2858836|Rzild, Fia Held |3730/4|M
COMP2041|4823158|Sheld, Yild |3978/2|M
COMP2041|4818044|Sheo, Sheo |3978/2|M
COMP2041|4818497|Sheo, Xa |3978/2|M
COMP9041|4899688|Shild, Ge |8680/2|M
COMP2041|4869506|Shild, Yild |3645/2|M
COMP9041|4897426|Shild, Yild |8680/2|M
COMP9041|4368551|Sho, Wuld |8684 |M
COMP2041|4339940|Shuld, Puaxail Baili |3978/3|F
COMP2041|4330093|Veh, Yeold-He |3711/3|M
COMP2041|2230267|Vikil, Ivrha |3978/3|F
COMP2041|4312663|Viy Chiobhova, Jiozrigh |3978/1|M
.......


The question is why I got different answers when I changed grep to egrep in the latter half of each pipeline. What are the differences between grep and egrep?

Answer

In egrep (or, preferably, grep -E), the | is a metacharacter, whereas in plain grep it is a plain (non-meta) character.

The |F$ term in egrep looks for either an empty string or an F at the end of the line. The empty string matches on every line, so every line passes the filter, which is why the count rises to 166.

The same term in grep looks for a |F at the end of line. To look for that with egrep, you'd need to escape the metacharacter with a backslash: grep -E '\|F$' enrolments.
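The difference can be reproduced on a small scale. The sketch below is illustrative only: it assumes GNU grep (other implementations may reject the empty alternative in '|F$' instead of matching it), and the variable names are hypothetical. The three sample lines are copied from the enrolments excerpt in the question.

```shell
# Three lines taken from the enrolments excerpt above; exactly one ends in "|F".
sample='COMP2041|4339940|Shuld, Puaxail Baili |3978/3|F
COMP2041|4330093|Veh, Yeold-He |3711/3|M
COMP9041|4368551|Sho, Wuld |8684 |M'

# BRE: | is a literal character, so only the line ending in "|F" matches.
bre_count=$(printf '%s\n' "$sample" | grep -c '|F$')

# ERE: an unescaped | is alternation; the empty left branch matches every line
# (assumption: GNU grep behaviour).
ere_count=$(printf '%s\n' "$sample" | grep -E -c '|F$')

# ERE with the | escaped: behaves like the BRE pattern again.
ere_escaped_count=$(printf '%s\n' "$sample" | grep -E -c '\|F$')

echo "BRE: $bre_count, ERE: $ere_count, ERE escaped: $ere_escaped_count"
```

With GNU grep this should report 1 match for the BRE pattern, 3 for the unescaped ERE pattern, and 1 for the escaped ERE pattern, mirroring the 24 versus 166 discrepancy in the question.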

In short, the plain grep command understands Basic Regular Expressions (BRE). The egrep or 'extended grep' command understands Extended Regular Expressions (ERE). Some versions of grep (such as GNU grep) can also be compiled to recognize Perl-Compatible Regular Expressions (PCRE), which GNU grep selects with the -P option.
