Angel Angel - 1 month ago 6
Bash Question

Multiple values in a regular expression in Unix - Multiple possible substrings inside an expression

I have a problem with a regular expression: I need to accept several possible values in the names of some files in Unix. If the pattern matches, it is Case A; otherwise, Case B. For example:

echo a | grep "^[a\|b\|c]$"
echo a | grep "^[b\|a\|c]$"
echo b | grep "^[a\|b\|c]$"
echo c | grep "^[a\|b\|c]$"

echo typeA | grep "^[typeA\|typeB]$"
echo typeA | grep "^[typeA\|typeB\|c]$"
echo typeA | grep "^[typeA\|typeB]$"
echo typeA | grep "^[typeA\|typeB]$"


With these examples, I get the following output:

a
a
b
c

(empty)
(empty)
(empty)
(empty)


I really don't know why cases 5, 6, 7 and 8 produce no output.

"Original code":

ls *.CTL > $ArchivosControl
for i in $(cat $ArchivosControl); do
    pattern=`echo $i | grep '^fixedvalues[0-9]\{7\}_[OptionA1\|OptionA2\|OptionA3]_fixedvalues_[OptionB1\|OptionB2]\.CTL$'`
    if [ "$pattern" != "" ]; then
        Cantidad_Control=$((Cantidad_Control+1))
        echo $pattern >> $List
    else
        echo "It doesn't match for $i"
    fi
done


EDIT 2016-10-13 20:30

kennytm's answer works on Linux:

echo t | grep "^[typeA\|typeB]$"


but I need it to work on a Unix server (AIX specifically):

echo P_typeA_123 | grep "^P_(typeA\|typeB)_[0-9]\{3\}$"


The alternation is part of a larger expression that I need to validate.

EDIT 2016-10-14 14:52

At the end of the string, I would like to check whether there is an 'N' or an 'H'. Should I use () or []? For example:

echo P_typeA_123N | grep -E "^P_(typeA\|typeB)_[0-9]\{3\}[N|H]$"
echo P_typeA_123N | grep -E "^P_(typeA\|typeB)_[0-9]\{3\}(N|H)$"


I have tried both options and I can't decide between them.

PS: Is 'grep -E' the same as 'egrep'? I can't find the difference.

Answer

You are using the wrong brackets.

$ #                   ↓↓            ↓↓
$ echo typeA | grep "^\(typeA\|typeB\)$"
typeA

Square brackets [] construct character classes. [typeA\|typeB] matches exactly one character out of t, y, p, e, A, \, | and B.

$ echo t | grep "^[typeA\|typeB]$"
t
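
The same rule applies to the [N|H] from the question's second edit: inside brackets, | is just another literal character. A small sketch to illustrate this (the helper name filter_chars is made up for this example):

```shell
# Hypothetical helper: print the candidates that match a one-character pattern.
# $1 = bracket expression; remaining arguments = candidate strings.
filter_chars() {
    pattern=$1
    shift
    printf '%s\n' "$@" | grep "^${pattern}\$"
}

filter_chars '[N|H]' N H '|' X    # prints N, H and | (the | is a literal here)
filter_chars '[NH]'  N H '|' X    # prints N and H only
```

So if only N or H should be accepted as a single character, [NH] is enough.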

What you want is grouping, which in Basic Regular Expression syntax is represented by \( … \).
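
Note that \| inside a basic regular expression is a GNU extension rather than part of POSIX BRE, which may explain why it does not behave the same on AIX. One portable option is extended regular expressions via grep -E, where ( ), | and { } are written without backslashes. A minimal sketch, assuming a POSIX-conforming grep -E; the function name is_valid_name is invented for illustration, and the pattern is the one from the question's edits:

```shell
# Hypothetical helper: succeed if $1 looks like P_<typeA|typeB>_<3 digits><N|H>.
is_valid_name() {
    # -E selects extended regular expressions; -q suppresses output.
    printf '%s\n' "$1" | grep -E -q '^P_(typeA|typeB)_[0-9]{3}[NH]$'
}

for name in P_typeA_123N P_typeB_456H P_typeC_123N P_typeA_12N; do
    if is_valid_name "$name"; then
        echo "match: $name"
    else
        echo "no match: $name"
    fi
done
```

With -E, (N|H) and [NH] accept the same strings here; the bracket form is simply shorter for single characters. As for the last question, egrep is the older name for the same extended-regex matching that grep -E provides.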
