BrianH BrianH - 5 months ago 7x
Perl Question

Is there a better way to write Perl regexes with /x so the code is still easy to read?

I ran Perl::Critic on one of my scripts, and got this message:

Regular expression without "/x" flag at line 21, column 26. See page 236 of PBP.

I looked up the policy information here, and I understand that writing regular expressions in extended mode will help anyone who is looking at the code.

However, I am stuck as how to convert my code to use the /x flag.

CPAN Example:

# Match a single-quoted string efficiently...

m{'[^\\']*(?:\\.[^\\']*)*'}; #Huh?

# Same thing with extended format...

' # an opening single quote
[^\\'] # any non-special chars (i.e. not backslash or single quote)
(?: # then all of...
\\ . # any explicitly backslashed char
[^\\']* # followed by an non-special chars
)* # ...repeated zero or more times
' # a closing single quote

This makes sense if you only look at the regex.

My Code:

if ($line =~ /^\s*package\s+(\S+);/ ) {

I am not exactly sure how to use an extended regex inside of an if statement. I can write it like this:

if (
$line =~ /
^\s* # starting with zero or more spaces
\s+ # at least one space
(\S+) # capture any non-space characters
; # ending in a semi-colon

And this works, but I think this is almost harder to read than the original. Is there a better way (or a best practice way) to write this? I guess I could create a variable using qr//.

I'm not really looking for advice on re-writing this specific regex (although if I can improve it, I'll take advice) - I'm more looking for advice on how to expand a regex inside of an if statement.

I know Perl::Critic is just a guideline, but it would be nice to follow it.

Thanks in advance!

So after receiving a few answers, it became clear to me that making a regex multi-line with comments is not always necessary. People who understand basic regex should be able to understand what my example was doing - the comments I added were maybe a little unnecessary and verbose. I like the idea of using the extended regex flag, but still embedding spaces in the regex to make each part of the regex a little more clear.
Thanks for all the input!


Well, I really don't think you should waste vertical screen real estate on this. On the other hand, if I were to write this pattern over several lines, I would have used braces and indented the pattern:

if ($line =~ m{
        \A \s*
        \s* ;
) {

IMHO, the following version is perfectly fine:

if ( $line =~ m{ \A \s* package \s+ (\S+) \s* ; }x  ) {

in terms of getting the benefit of m//x.

The comments are completely unnecessary in this case because you are not doing anything tricky. I did add \s* before the semi-colon because sometimes people set the semi-colon apart from the package name and that should not throw off your match.