Wh1T3h4Ck5 - 5 months ago
PHP Question

PHP throws warning when tilde is used as regex delimiter?

I use this pretty simple regular expression:

^[\x20-\x7E]+$


When I try to use it with PHP regex functions such as preg_match(), it throws a warning, but only when I use the ~ character (tilde) as the delimiter.

So the following lines execute without problems:

preg_match("/^[\x20-\x7E]+$/", $s); # delimiter "/"
preg_match("!^[\x20-\x7E]+$!", $s); # delimiter "!"
preg_match("#^[\x20-\x7E]+$#", $s); # delimiter "#"


but for some reason, this line

preg_match("~^[\x20-\x7E]+$~", $s); # delimiter "~"


throws a warning

Warning: preg_match(): Unknown modifier ']' in some_script.php on line XX


Note: this happens only when the pattern is written in double quotes!

I use the tilde as a delimiter all the time and never had problems with it until this case, so I really wonder why this happens. I can't find out whether the tilde has some special meaning in regular expressions (I'm now 99% sure it does not) or whether it's just a bug.

I can certainly make it work, but my question is: what's the difference between the tilde and any other delimiter?

Answer

You were using double quotes:

 "~^[\x20-\x7E]+$~"

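This means that both \x20 and \x7E are interpreted by PHP as string escape sequences before PCRE ever sees them. \x7E is the tilde itself, so the pattern PCRE receives is ~^[ -~]+$~. The ~ inside the character class is taken as the closing delimiter, and the ] that follows is read as a modifier, hence the warning. With /, ! or # as the delimiter the same substitution happens, but [ -~] is still a valid character class, so nothing complains.

So, as @Bitwise mentioned, use single quotes. Or, better yet, escape the backslashes: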

 "~^[\\x20-\\x7E]+$~"

Thus the regex engine will still see [\x20-\x7E] instead of [ -~].
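
To see the difference for yourself, here is a minimal sketch that compares the three ways of writing the pattern. The variable names ($double, $single, $escaped) and the sample input are made up for illustration; only preg_match() and var_dump() come from PHP itself.

```php
<?php
// Double quotes: PHP converts \x20 and \x7E to " " and "~" before PCRE runs.
$double = "~^[\x20-\x7E]+$~";

// Single quotes: the backslashes reach PCRE untouched.
$single = '~^[\x20-\x7E]+$~';

// Double quotes with doubled backslashes: should be the same string as $single.
$escaped = "~^[\\x20-\\x7E]+$~";

// Dump the string that preg_match() actually receives in the failing case.
var_dump($double);

// Check that the two working spellings are identical strings.
var_dump($single === $escaped);

// Hypothetical sample input containing only printable ASCII.
$s = 'Hello, world!';

// Works: PCRE sees the hex escapes inside the character class.
var_dump(preg_match($single, $s));

// Fails: the literal ~ in the class ends the pattern early.
// The @ only hides the "Unknown modifier ']'" warning for this demo.
var_dump(@preg_match($double, $s));
```

Dumping the pattern string before handing it to preg_match() is a quick way to check what PCRE will really get whenever a delimiter error looks inexplicable.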
