V. Tej - 22 days ago
Perl Question

regex to grep a particular word while considering spaces occurring before the word

I am looking for a regex that tolerates variable spacing. My code does the following:
If the class extends base_class, it pushes just the current class name onto the array. Otherwise, it matches the name of the extended class and pushes both that name and the current class name onto the array.

my $key = "class " . $current_class_name . " extends";
my $variable1 = "extends base_class";
if (/$key/) {
    if (/($variable1)/) {    # Checking if it extends from "base_class"
        push @test_list, $current_class_name;    # Pushing the test name if it extends from "base_class"
    }
    else {    # If it doesn't extend from "base_class"
        /.extends[\s]+([A-Za-z_0-9]+)/;
        push @test_list, $1;    # Pushing the extended test name into array
        push @test_list, $current_class_name;    # Pushing the current test name into array
    }
}


I have 2 questions.
1) When matching against the string $key (if(/$key/)), how do I allow for variable spacing? For example, a line may read "class    $current_class_name    extends", with several spaces between class and $current_class_name, and likewise between $current_class_name and extends. The first line of my code assumes exactly one space between those strings, but I want to handle any number of spaces (from 1 up to a maximum of 10).
So, please help me to handle this issue.

2) Similarly, when we take the word that comes after extends in these lines of code:

/.extends[\s]+([A-Za-z_0-9]+)/ ;
push @test_list, $1;


How do I capture that word and push it if the extended class name comes several spaces after the extends string?

I hope my explanations are clear. Please comment if any part of my question is unclear. I will edit it accordingly.

Thanks

Answer

A few recommendations for you:

  • + matches 1 or more iterations of the previous character/group

  • {<number>} matches exactly that number of iterations of the previous character/group, so {10} matches exactly 10 iterations.

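  • {<number1>,<number2>} matches between number1 and number2 iterations of the previous character/group. So {1,10} matches between 1 and 10 iterations, {2,} matches 2 or more iterations, and {0,10} matches between 0 and 10 iterations. (The shorthand {,10} is only recognized as a quantifier in Perl 5.34 and later; older versions treat it as literal text.)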

  • \s matches any whitespace character, including spaces, tabs, and newlines

  • I suggest trying out string interpolation, as it is one of my favorite things about Perl, e.g. "class $current_class_name extends" instead of "class " . $current_class_name . " extends". Interpolation happens in double-quoted strings, but not in single-quoted ones.

  • This falls under style, but I generally don't create a variable if it is only going to be used in one place.

  • Always test that your regex matched before you use $1; after a failed match, $1 still holds the capture from the previous successful match.
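
The points above can be tried out with a small sketch. The sample strings and the helper name parent_of are made up for illustration; they are not part of the original code.

```perl
use strict;
use warnings;

# Quantifiers applied to made-up sample strings.
my @plus    = grep { /^a+$/ }     ('', 'a', 'aaa');          # one or more
my @exact   = grep { /^a{3}$/ }   ('aa', 'aaa', 'aaaa');     # exactly 3
my @range   = grep { /^a{1,3}$/ } ('', 'a', 'aaa', 'aaaa');  # between 1 and 3
my @atleast = grep { /^a{2,}$/ }  ('a', 'aa', 'aaaaa');      # 2 or more

# Hypothetical helper: only read $1 when the match succeeded,
# so a stale capture from an earlier match can never leak out.
sub parent_of {
    my ($line) = @_;
    return $line =~ /extends\s+([A-Za-z_0-9]+)/ ? $1 : undef;
}

print "plus:    @plus\n";
print "exact:   @exact\n";
print "range:   @range\n";
print "atleast: @atleast\n";
print "parent:  ", parent_of('class a  extends   b_1;'), "\n";
```

Returning undef on a failed match makes the "did it match?" check explicit at the call site.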

Example:

if (/class\s+$current_class_name\s+extends/) {
    if (/extends\s+base_class\b/) {
        push @test_list, $current_class_name;
    }
    elsif (/extends\s+([A-Za-z_0-9]+)/) {
        push @test_list, $1;
        push @test_list, $current_class_name;
    }
    else {
        # not sure what you want to do in this case, looks like it
        # would be a syntax error assuming this is Java
    }
}
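
For trying this out in isolation, here is a minimal self-contained sketch. It is a variation rather than the exact code above: it folds the checks into a single match, wraps them in a hypothetical helper named classes_for_line, and runs on made-up input lines. It also adds \Q...\E around the interpolated name so that any regex metacharacters in it are matched literally, which is an extra precaution not present in the original.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names to record for one line of source.
sub classes_for_line {
    my ($line, $current_class_name) = @_;
    my @found;
    # \s+ accepts any amount of whitespace between the three parts.
    if ($line =~ /class\s+\Q$current_class_name\E\s+extends\s+([A-Za-z_0-9]+)/) {
        my $parent = $1;    # safe to read: the match succeeded
        push @found, $parent unless $parent eq 'base_class';
        push @found, $current_class_name;
    }
    return @found;
}

# Made-up sample lines with irregular spacing.
my @test_list;
push @test_list, classes_for_line('class   my_test    extends   base_class;', 'my_test');
push @test_list, classes_for_line('class other_test extends     my_test;',    'other_test');
push @test_list, classes_for_line('class other_test;',                        'other_test');

print join(', ', @test_list), "\n";
```

The first line contributes only my_test, the second contributes my_test and other_test, and the third contributes nothing because it has no extends clause.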

You can change

/class\s+$current_class_name\s+extends/

to

/class\s{1,10}$current_class_name\s{1,10}extends/

if you want to keep to the 1-10 space limit. \s matches tabs too, so if you really only want to accept spaces you can change it to

/class[ ]{1,10}$current_class_name[ ]{1,10}extends/
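
The difference between the \s and [ ] variants can be checked with a quick sketch. The class name and sample lines are made up, and \Q...\E is used around the interpolated variable here so that the [ that follows it cannot be read as an array subscript.

```perl
use strict;
use warnings;

my $name = 'my_test';    # made-up class name

my $tabbed = "class\t$name\textends base_class";              # tabs as separators
my $spaced = "class   $name   extends base_class";            # three spaces
my $wide   = 'class' . (' ' x 11) . "$name extends base_class"; # eleven spaces

# \s{1,10} accepts tabs as well as spaces.
my $ws_tab = $tabbed =~ /class\s{1,10}\Q$name\E\s{1,10}extends/ ? 1 : 0;

# [ ]{1,10} accepts only literal spaces, so the tabbed line is rejected.
my $space_tab   = $tabbed =~ /class[ ]{1,10}\Q$name\E[ ]{1,10}extends/ ? 1 : 0;
my $space_space = $spaced =~ /class[ ]{1,10}\Q$name\E[ ]{1,10}extends/ ? 1 : 0;

# Eleven spaces exceed the {1,10} limit, so this line is rejected too.
my $too_wide = $wide =~ /class\s{1,10}\Q$name\E\s{1,10}extends/ ? 1 : 0;

print "ws_tab=$ws_tab space_tab=$space_tab space_space=$space_space too_wide=$too_wide\n";
```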