Wh1T3h4Ck5 - 1 year ago
PHP Question

Why does the delimiter used affect the validity of a regex?

I use this fairly simple regular expression:

^[\x20-\x7E]+$

When I use it with PHP's regex functions, such as preg_match(), it throws a warning only when I use the ~ character (tilde) as the delimiter.

So the following lines execute without problems:

preg_match("/^[\x20-\x7E]+$/", $s); # delimiter "/"
preg_match("!^[\x20-\x7E]+$!", $s); # delimiter "!"
preg_match("#^[\x20-\x7E]+$#", $s); # delimiter "#"

but for some reason, this line

preg_match("~^[\x20-\x7E]+$~", $s); # delimiter "~"

throws a warning

Warning: preg_match(): Unknown modifier ']' in some_script.php on line XX

Note: it happens only when the pattern is in double quotes!

I use the tilde as a delimiter all the time and never had problems with it until this case, so I really wonder why this happens. I can't find whether the tilde has some special meaning in regular expressions (I'm 99% sure it does not), or whether it's just a bug.

I can certainly work around this, but the question is: What's the difference between tilde and any other delimiter?

Answer Source

You were using double quotes:

preg_match("~^[\x20-\x7E]+$~", $s);

Which means that both \x20 and \x7E got interpreted in PHP string context, not by PCRE. Guess what \x7E amounts to.
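
As a minimal sketch of that point (the variable names here are illustrative and not from the question), comparing the double-quoted and single-quoted versions of the pattern shows what PHP hands to PCRE in each case:

```php
<?php
// In a double-quoted string, PHP expands \xHH escapes itself,
// so \x20 becomes a space and \x7E becomes a literal "~".
$double = "~^[\x20-\x7E]+$~";

// In a single-quoted string, the backslash sequences are left alone
// and reach PCRE unchanged.
$single = '~^[\x20-\x7E]+$~';

// The double-quoted pattern is really "~^[ -~]+$~": the tilde inside
// the character class now looks like the closing delimiter.
var_dump($double === '~^[ -~]+$~');

// The single-quoted pattern still contains the text \x7E, so the only
// tildes in it are the two delimiters.
var_dump(substr_count($single, '~'));
var_dump(substr_count($double, '~'));
```

With the double-quoted form, PCRE reads `^[ -` as the pattern and treats everything after the second tilde, starting with `]`, as modifiers, which is where the "Unknown modifier ']'" warning comes from.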

So as @Bitwise mentioned, use single quotes. Or better yet, escape the escape sequences:

preg_match("~^[\\x20-\\x7E]+$~", $s);

Thus the regex engine will still see [\x20-\x7E] instead of [ -~], where the literal ~ would end the pattern early and leave ] to be read as a modifier.
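
A short sketch of both fixes, assuming a couple of sample strings (`$printable` and `$withTab` are made up for illustration):

```php
<?php
// Hypothetical sample inputs, not taken from the question.
$printable = 'Hello, world!';
$withTab   = "tab\there"; // contains 0x09, outside the \x20-\x7E range

// Fix 1: single quotes, so PCRE receives \x20 and \x7E untouched.
$r1 = preg_match('~^[\x20-\x7E]+$~', $printable);

// Fix 2: double quotes with the backslashes escaped, which yields
// the same pattern text as the single-quoted version.
$r2 = preg_match("~^[\\x20-\\x7E]+$~", $printable);

// The same pattern rejects a string containing a tab character.
$r3 = preg_match('~^[\x20-\x7E]+$~', $withTab);

var_dump($r1, $r2, $r3);
```

Both forms pass the identical pattern text to PCRE, so the choice between them is mostly a matter of whether the pattern also needs variable interpolation.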
