Janiaje Janiaje - 6 months ago 14
Linux Question

Linux Bash Script Regex malfunction

I would like to write a bash script that decides whether the given strings satisfy a set of rules or not.

The rules are:


  • The first 3 characters of the string must be "le-"

  • Between hyphens there can be any number of consonants in any arrangement and exactly one "e"; no other vowel is allowed.

  • There must be something between two hyphens

  • The string must not end with a hyphen



I made this script:

#!/bin/bash
# Testing regex

while read -r line; do
    if [[ $line =~ ^le((-[^aeiou\W]*e+[^aeiou\W]*)+)$ ]]
    then
        # pass the line as an argument so a "%" in the input is not read as a format
        printf '"%s"\t\t\t-> True\n' "$line"
    else
        printf '"%s"\t\t\t-> False\n' "$line"
    fi
done < <(cat "$@")


It works fine, except for one thing:
It says True no matter how many hyphens are next to each other.
For example, it says True for the string "le--le".

I tried this regex on regex-testing websites (like this) and it worked there without this problem.
All I can think of is that there must be some difference between the web page and bash on Linux. (All I can see is that the web page runs PHP.)

Do you have any idea how I could make it work?

Thank you for your answers!

Answer

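sweaver2112 rightly points out that the \W is causing you problems, but doesn't provide a working example of a bash test regex that does what you ask (at least, i couldn't get it to work). bash's =~ uses POSIX extended regular expressions, where \W is not a shorthand class inside a bracket expression, so [^aeiou\W] does not exclude the hyphen. that is why "le--le" matches in bash but not on a PHP (PCRE) tester.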
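
a quick way to see the \W problem for yourself, as a rough sketch: the helper name matches and the variables old_class and old_re are made up for this illustration, and the patterns are kept in variables so they reach the regex engine unchanged.

```bash
#!/bin/bash
# The "consonant" class and the full pattern from the question, unchanged.
# Hypothetical variable names, used only for this demonstration.
old_class='^[^aeiou\W]$'
old_re='^le((-[^aeiou\W]*e+[^aeiou\W]*)+)$'

# matches STRING REGEX: succeeds when STRING matches REGEX (POSIX ERE).
matches() {
    [[ $1 =~ $2 ]]
}

# Inside a bracket expression \W is not a shorthand class,
# so the hyphen is not excluded and counts as a "consonant".
for s in '-' 'b' 'a'; do
    if matches "$s" "$old_class"; then
        printf '"%s" is accepted by the class\n' "$s"
    else
        printf '"%s" is rejected by the class\n' "$s"
    fi
done

# The doubled hyphen therefore slips through the original pattern.
if matches 'le--le' "$old_re"; then
    printf '"le--le" matches the original regex\n'
fi
```

if the hyphen shows up as accepted, the extra hyphens are simply being swallowed by the character class, which is consistent with what you are seeing.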

this seems to do it (adapting Laurel's consonant regex):

[[ "$line" =~ ^le(-[b-df-hj-np-tv-z]*e[b-df-hj-np-tv-z]*)+$ ]]

it matches (e.g.):

le-e
le-e-le
le-e-e-e-e-e

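and more generally, in pseudo-notation ([[:consonant:]] is not a real POSIX character class, it just stands for the consonant set above):

le(-[[:consonant:]]*e[[:consonant:]]*)+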

and doesn't match (e.g.):

le-
le--le
le-lea-le

also, you can write it more cleanly this way:

c='[b-df-hj-np-tv-z]'
[[ "$line" =~ ^le(-$c*e$c*)+$ ]]
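
putting it together, here is a minimal sketch that runs the examples above through the check. the function name is_valid and the variable re are my own, and forcing the C locale is an assumption on my part so that ranges like b-d only cover lowercase ASCII letters; drop it if your locale already behaves that way.

```bash
#!/bin/bash
# Assumption: use the C locale so bracket ranges are plain ASCII ranges.
LC_ALL=C

# Consonant class from the answer, and the full pattern built from it.
c='[b-df-hj-np-tv-z]'
re="^le(-${c}*e${c}*)+\$"

# is_valid STRING: succeeds when STRING follows the rules
# (hypothetical helper name, not part of the original script).
is_valid() {
    [[ $1 =~ $re ]]
}

# Sample strings taken from the lists above.
for s in le-e le-e-le le-e-e-e-e-e le- le--le le-lea-le; do
    if is_valid "$s"; then
        printf '"%s"\t-> True\n' "$s"
    else
        printf '"%s"\t-> False\n' "$s"
    fi
done
```

to check a file instead, the for loop can be swapped for the while read -r line loop from the question, calling is_valid "$line" in the test.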