Angel Angel - 4 months ago
Bash Question

Multiple values in a regular expression in Unix - Matching one of several possible substrings inside a larger expression

I have a problem with a regular expression: I need to accept several possible values in the names of some files on Unix. If the pattern matches, it is case A; otherwise it is case B. For example:

echo a | grep "^[a\|b\|c]$"
echo a | grep "^[b\|a\|c]$"
echo b | grep "^[a\|b\|c]$"
echo c | grep "^[a\|b\|c]$"

echo typeA | grep "^[typeA\|typeB]$"
echo typeA | grep "^[typeA\|typeB\|c]$"
echo typeA | grep "^[typeA\|typeB]$"
echo typeA | grep "^[typeA\|typeB]$"

With these examples, I get the following output



I really don't understand why cases 5, 6, 7 and 8 produce no output.

"Original code":

ls *.CTL > $ArchivosControl
for i in $(cat $ArchivosControl); do
    # $pattern is empty when the file name does not match
    pattern=`echo $i | grep '^fixedvalues[0-9]\{7\}_[OptionA1\|OptionA2\|OptionA3]_fixedvalues_[OptionB1\|OptionB2]\.CTL$'`
    if [ "$pattern" != "" ]; then
        echo $pattern >> $List
    else
        echo "It doesn't match for $i"
    fi
done

EDIT 2016-10-13 20:30

kennytm's answer works on Linux:

echo t | grep "^[typeA\|typeB]$"

but I need it to work on a Unix server (AIX specifically):

echo P_typeA_123 | grep "^P_(typeA\|typeB)_[0-9]\{3\}$"

The alternation is part of a larger expression that I need to validate.

EDIT 2016-10-14 14:52

At the end of the string, I would like to check whether there is an 'N' or an 'H'. Should I use () or []? For example:

echo P_typeA_123N | grep -E "^P_(typeA\|typeB)_[0-9]\{3\}[N|H]$"
echo P_typeA_123N | grep -E "^P_(typeA\|typeB)_[0-9]\{3\}(N|H)$"

I have tried both options and I can't decide between them.

PS: Is 'grep -E' the same as 'egrep'? I can't find the difference.


You are using the wrong brackets.

$ #                   ↓↓            ↓↓
$ echo typeA | grep "^\(typeA\|typeB\)$"

The [] is used to construct character classes (bracket expressions). [typeA\|typeB] matches exactly one character from the set t, y, p, e, A, B, \ and |; inside brackets, \ and | are ordinary characters.

$ echo t | grep "^[typeA\|typeB]$"
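To make the character-class behaviour concrete, here is a small sketch that feeds several inputs to the same bracket expression. The helper name bracket_match is made up for illustration. Only single characters from the set should match; the whole word typeA should not, because the anchors leave room for exactly one character.

```shell
#!/bin/sh
# bracket_match is a hypothetical helper, not part of grep.
# It reports whether the input is exactly ONE character taken from the
# set t, y, p, e, A, B, \ or | (the bracket expression from the question).
bracket_match() {
    if printf '%s\n' "$1" | grep -q '^[typeA\|typeB]$'; then
        echo "match"
    else
        echo "no match"
    fi
}

# Single characters from the set match; the full word does not.
for s in t A '|' typeA; do
    printf '%s: %s\n' "$s" "$(bracket_match "$s")"
done
```

This is why cases 1 to 4 in the question appear to work (each input is a single character) while the typeA cases print nothing.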

What you want is grouping, which in Basic Regular Expression syntax is represented by \( … \).
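Regarding the edits in the question: \| inside a Basic Regular Expression is a GNU extension, not part of POSIX, which is probably why the pattern works on Linux but not on AIX. A portable alternative, assuming the grep on your server accepts the POSIX -E option, is Extended Regular Expression syntax, where ( ), | and { } are all written without backslashes. For the final letter, [NH] is enough: inside brackets | is a literal character, so [N|H] would also accept a trailing |. As for egrep, it is the older name for grep -E and behaves the same way. The function name is_valid_name below is hypothetical; this is a minimal sketch, not a drop-in replacement for your script.

```shell
#!/bin/sh
# is_valid_name is a hypothetical helper for illustration.
# ERE syntax (grep -E): grouping is ( ), alternation is |, and
# repetition is {3}, all written WITHOUT backslashes.
# [NH] matches exactly one character, either N or H.
is_valid_name() {
    printf '%s\n' "$1" | grep -Eq '^P_(typeA|typeB)_[0-9]{3}[NH]$'
}

# The first two names are expected to match, the last two are not.
for name in P_typeA_123N P_typeB_456H P_typeC_123N 'P_typeA_123|'; do
    if is_valid_name "$name"; then
        echo "$name: match"
    else
        echo "$name: no match"
    fi
done
```

The same idea applies to the file-name pattern in the original loop: replace each [OptionA1\|OptionA2\|OptionA3] with (OptionA1|OptionA2|OptionA3) and switch the command to grep -E.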