Joseph Walker - 1 year ago
Perl Question

How to loop through an array to find more than one pattern using a Perl regex?

I'm trying to find two patterns within an array and put the results into another array.

For example

$/ = "__Data__";

#SCSI_test # put this line into @arrayNewLines
- ccccccccccccccc # put this line into @arrayNewLines


my @arrayOld = split(\n,@array);

foreach my $i (0 .. $#arrayOld) {
    if($arrayOld[$i] =~ /^-(.*)/g or /\#(.*)/g) {
        my @arrayNewLines = $arrayOld[$i];
        print "@arrayNewLines\n";
    }
}

This code prints out only ccccccccccccccc
But I would like it to output ccccccccccccccc #SCSI_test

Answer

That code does not print just cccccc..., it prints everything. Your problem is this line:

if($arrayOld[$i] =~ /^-(.*)/g or /\#(.*)/g) {

What you are doing here is first checking $arrayOld[$i] and then checking $_, because a bare /\#(.*)/ is Perl shorthand for $_ =~ /\#(.*)/. Since $_ contains a hash character #, it will always match, and the line will always print.

Your line is equivalent to:

if(   $arrayOld[$i] =~ /^-(.*)/g
      or
      $_ =~ /\#(.*)/g) {

The answer there is to join the regexes:

if($arrayOld[$i] =~ /^-|#/) {
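
To see the difference, here is a small sketch that runs both conditions side by side. The sample lines and the sub names original_test and joined_test are made up for illustration, and $_ is set by hand to stand in for the record the enclosing while loop would hold. The /g flags are left out because they are not needed for a yes/no test.

```perl
use strict;
use warnings;

# Hypothetical sample lines; only the first two should be selected.
my @lines = ('#SCSI_test', '- ccccccccccccccc', 'plain text');

# Mimics the original condition: the second regex is matched
# against $_, not against the argument.
sub original_test {
    my ($line) = @_;
    my $hit = ($line =~ /^-(.*)/ or /\#(.*)/);
    return $hit ? 1 : 0;
}

# The joined regex tests only the argument.
sub joined_test {
    my ($line) = @_;
    return $line =~ /^-|#/ ? 1 : 0;
}

# Stand-in for the record the outer loop has placed in $_.
$_ = "\n#SCSI_test\n- ccccccccccccccc\n";

my (@original, @joined);
for my $line (@lines) {
    push @original, $line if original_test($line);
    push @joined,   $line if joined_test($line);
}

print "original condition selected ", scalar(@original), " lines\n";
print "joined regex selected ",       scalar(@joined),   " lines\n";
```

With this data the original condition selects all three lines, because $_ always contains a #, while the joined regex selects only the two intended ones.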

However, your code is far from clean after that... starting from the top:

If you set the input record separator $/ to __Data__ with that input, you will get two records (Data::Dumper output shown below):

$VAR1 = '__Data__';
$VAR1 = '
#SCSI_test         # put this line into  @arrayNewLines
- ccccccccccccccc  # put this line into @arrayNewLines
';

When you chomp the records, you will remove __Data__ from the end, so the first line will become empty. So in essence, you will always have a leading empty field. This is nothing horrible, but something to remember.
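
A minimal sketch of that behaviour, assuming input shaped like the example in the question. It reads from an in-memory string rather than a real file, and the variable names $input and @records are only for illustration.

```perl
use strict;
use warnings;

# Assumed input, modelled on the example in the question.
my $input = "__Data__\n#SCSI_test\n- ccccccccccccccc\n";

# Open a read handle on the in-memory string instead of a real file.
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = "__Data__";   # record separator, restored at end of block
    while (my $record = <$fh>) {
        chomp $record;       # strips a trailing "__Data__", if present
        push @records, $record;
    }
}
close $fh;

printf "%d records\n", scalar @records;
printf "first record is %s\n", $records[0] eq '' ? 'empty' : 'not empty';
```

The first record consists of nothing but the separator, so chomp leaves an empty string. The second record runs to end of file and keeps its leading newline.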

Your split statement is wrong. First off, the first argument should be a regex: /\n/. The second argument should be a scalar, not an array. split(/\n/,@array) will evaluate to split(/\n/, 2), because the array is in scalar context and returns its size instead of its elements.
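
The scalar-context effect is easy to reproduce. In this sketch the two-element @array is a hypothetical stand-in for whatever your array holds:

```perl
use strict;
use warnings;

# Hypothetical two-element array standing in for @array.
my @array = ("a\nb", "c\nd");

# split puts its second argument in scalar context, so @array becomes
# its element count (2) and that number is what gets split.
my @wrong = split /\n/, @array;

# Splitting one scalar at a time gives the intended lines.
my @right = map { split /\n/, $_ } @array;

print "wrong: @wrong\n";
print "right: @right\n";
```

Here @wrong ends up holding the single element 2, while @right holds the four lines.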

Also, of course, since you are in a loop reading lines from the FILEREAD handle, that @array array will always contain the same data, and has nothing to do with the data from the file handle. What you want is: split /\n/, $_.

This loop:

foreach my $i (0 .. $#arrayOld) {

is not a very good loop structure for this problem. Also, there is no need to use an intermediate array. Just use:

for my $line (split /\n/, $_) {

When you do

my @arrayNewLines = $arrayOld[$i];
print "@arrayNewLines\n";

You are setting the entire array to a scalar, then printing it, which is completely redundant. You get the same effect just printing the scalar directly.
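
If you do want the matching lines collected in another array, as the question describes, the array has to be declared once outside the loop and appended to with push. A sketch with made-up sample lines:

```perl
use strict;
use warnings;

# Made-up sample lines for illustration.
my @arrayOld = ('#SCSI_test', '- ccccccccccccccc', 'other');

# Declared once, outside the loop, so matches accumulate.
my @arrayNewLines;
for my $line (@arrayOld) {
    push @arrayNewLines, $line if $line =~ /^-|#/;
}

print "$_\n" for @arrayNewLines;
```

Declaring the array with my inside the loop body, as in the original, creates a fresh one-element array on every iteration.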

Your code should look like this:

while(<FILEREAD>) {
    foreach my $line (split /\n/, $_) {
        if($line =~ /^-|#/) {
            print "$line\n";
        }
    }
}

It is also recommended that you use lexical file handles, so instead of

open FILEREAD, "somefile" or die $!;       # read with <FILEREAD>

use:

open my $fh, "<", "somefile" or die $!;    # read with <$fh>
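
Putting the pieces together, a self-contained sketch with a lexical file handle might look like the following. It writes its own throwaway input file with File::Temp so that it can run anywhere; the file contents and the @matches array are assumptions for illustration, not part of your program.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file so the sketch is self-contained.
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "__Data__\n#SCSI_test\n- ccccccccccccccc\nother line\n";
close $out or die "Cannot close $filename: $!";

my @matches;
open my $fh, '<', $filename or die "Cannot open $filename: $!";
{
    local $/ = "__Data__";   # read "__Data__"-terminated records
    while (my $record = <$fh>) {
        chomp $record;
        # Keep lines that start with "-" or contain "#".
        push @matches, grep { /^-|#/ } split /\n/, $record;
    }
}
close $fh;

print "$_\n" for @matches;
```

The handle $fh is an ordinary lexical variable here, so it can be passed to subroutines like any other scalar.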