Will Bolden - 7 days ago
Perl Question

Why doesn't this Perl regex work?

@matches = ( $filestr =~ /^[0-9]+\. (.+\n)*/mg );


I have a file that's been read into $filestr. The regex above should match the beginning of a line, followed by a number, a dot, a space, and then any number of non-empty lines, each ending in a newline (so the match ends at the first line that contains only a newline). For some reason it just produces some single lines from the file.

When I do something like

@matches = ( $filestr =~ /^[0-9]+\. .+\n/mg );


I correctly match a single line.

When I do this

@matches = ( $filestr =~ /^[0-9]+\. .+\n.+\n/mg );


I match the same single lines, followed by some seemingly unrelated line. What's wrong with my regex?

Note: The regex works fine in this regex tester: https://regex101.com/, it just won't work in Perl.

Example, in this text:

1. This should
match

2. This should too

3. This
one
also


the regex should match

1. This should
match


and

2. This should too


and

3. This
one
also

Answer

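Your regex is right, but you are only capturing part of the result. In list context, a /g match that contains a capturing group returns the captured text rather than the whole match, and a repeated group such as (.+\n)* keeps only its last iteration. That is why @matches ends up holding single lines. To get each whole block into @matches, capture the entire match in a single group.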

So the corrected regex becomes /^([0-9]+\. (?:.+\n)*)/gm, which captures each matched block into $1.

It also works without the outer parentheses, because a /g match in list context with no capturing groups returns the whole match (what $& would hold) for each hit. So in cases like this, use a non-capturing group (?: ... ) instead of a capturing group ( ... ) for anything you only need for grouping.

Wrapping it up into a program:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $str = '
1. This should
match

2. This should too

3. This
one
also
';

my @matches = $str =~ /^([0-9]+\. (?:.+\n)*)/gm;

print Dumper(\@matches);

Output:

$VAR1 = [
          '1. This should
match
',
          '2. This should too
',
          '3. This
one
also
'
        ];
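
To see the difference side by side, here is a minimal sketch that runs the original pattern and the non-capturing variant against the sample text from the question. The variable names @last_lines and @blocks are only illustrative and do not come from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text from the question.
my $str = <<'END';
1. This should
match

2. This should too

3. This
one
also
END

# Repeated capturing group: in list context m//g returns the group's
# contents, and a repeated group keeps only its last iteration.
my @last_lines = $str =~ /^[0-9]+\. (.+\n)*/mg;

# Non-capturing group: with no captures, m//g returns each whole match.
my @blocks = $str =~ /^[0-9]+\. (?:.+\n)*/mg;

print "capturing, first item:     $last_lines[0]";
print "non-capturing, first item: $blocks[0]";
```

With the capturing group, the first item is just "match" (the last line of the first block); with the non-capturing group, it is the full two-line block starting with "1. This should".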