Gaurav - 2 months ago
Perl Question

Regular Expression to match and split with operator in Perl

I need a regular expression that matches exact keywords as well as special characters and operators.

For instance, I have the following string and want to split it with a regular expression:

my $data="long i = sbyte.MinValue ; i => sbyte.MaxValue ; > i++";


If I split this on the equals operator (=), it should return two strings:


  1. long i

  2. sbyte.MinValue ; i => sbyte.MaxValue ; > i++



If I split on =>, it should return:


  1. long i = sbyte.MinValue ; i

  2. sbyte.MaxValue ; > i++



Here is the example code:

my $key = "=";
my $data = "long i = sbyte.MinValue ; i => sbyte.MaxValue ; > i++";

#=~/\b$s\b/
#/\b$key\b/

my @matches = ( $data =~ /\b$key\b/ );

my @string = split( /\b$key\b/, $data ); # split ~ /^=$/, $data;

if ( scalar(@string) > 0 ) {
    foreach my $item ( @string ) {
        print "$item \n";
    }
}
else {
    print "Nothing found \n";
}


The problem arises when the key to search for and split on is an operator. Exact matching works for keywords and other text, but it does not work for operators such as =, >=, <=, !=, <<=, =>>, ++ and --.

I need to search for each operator one at a time and split the text on it.
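
A minimal reproduction of the symptom might look like the sketch below. The helper name split_with_word_boundaries is hypothetical, and \Q...\E is added only so that keys such as ++ are taken literally; the \b anchors are the same as in the code above.

```perl
use strict;
use warnings;

my $data = "long i = sbyte.MinValue ; i => sbyte.MaxValue ; > i++";

# Hypothetical helper: split on a literal key wrapped in \b anchors.
sub split_with_word_boundaries {
    my ($key, $text) = @_;
    return split /\b\Q$key\E\b/, $text;
}

# \b only matches between a word character (\w) and a non-word character.
# "long" is made of word characters, so \blong\b matches it.
my @by_keyword = split_with_word_boundaries('long', $data);

# "=" is a non-word character with spaces (also non-word) on both sides,
# so there is no word boundary next to it and the pattern never matches.
my @by_operator = split_with_word_boundaries('=', $data);

print scalar(@by_keyword),  " piece(s) when splitting on 'long'\n";
print scalar(@by_operator), " piece(s) when splitting on '='\n";
```

With the sample string, the keyword split yields two pieces (an empty leading field and the rest), while the operator split returns the whole string as a single piece.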

Answer

You could use a tokenizer to parse the string for you, rather than reinventing the wheel. Here is an example using PPI::Tokenizer (note that PPI is written to tokenize Perl source, so it only recognizes operators that Perl itself has):

#!/usr/bin/env perl

use strict;
use warnings;

use List::MoreUtils qw( any );
use PPI::Tokenizer;

my @operators_i_care_about = qw( = => >= <= != <<= =>> ++ -- );

my $data = "long i = sbyte.MinValue ; i => sbyte.MaxValue ; > i++";

my $tokenizer = PPI::Tokenizer->new( \$data );

for my $token ( @{ $tokenizer->all_tokens } ) {
    if ( 'PPI::Token::Operator' eq ref $token
         and any { $_ eq $token->content } @operators_i_care_about ) {
        print "\nOPERATOR: $token\n";
    } else {
        print $token; # Stringifies
    }
}

Output

long i
OPERATOR: =
 sbyte.MinValue ; i
OPERATOR: =>
 sbyte.MaxValue ; > i
OPERATOR: ++
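
If you would rather stay with split and a regular expression, as the question asks, the following is a rough sketch of one way to do it, not a definitive implementation. The \b anchor only matches between a word character and a non-word character, which is why it fails around operators; here it is replaced with lookarounds that reject neighbouring operator characters. The helper name split_on_operator and the set of operator characters in $op_char are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text on the exact operator $op, skipping
# occurrences that are part of a longer operator.
sub split_on_operator {
    my ($op, $text) = @_;

    # Characters assumed to make up operators in the input; extend as needed.
    my $op_char = qr/[-=<>!+]/;

    # \Q...\E quotes metacharacters such as + so they match literally.
    # The lookarounds reject a match that touches another operator character,
    # so "=" does not match inside "=>" or ">=".
    my $pattern = qr/\s*(?<!$op_char)\Q$op\E(?!$op_char)\s*/;

    # A limit of -1 keeps a trailing empty field, which split would otherwise
    # drop when the operator sits at the very end of the string.
    return split /$pattern/, $text, -1;
}

my $data = "long i = sbyte.MinValue ; i => sbyte.MaxValue ; > i++";

for my $op ('=', '=>', '++') {
    my @parts = split_on_operator($op, $data);
    print "Split on '$op':\n";
    print "  [$_]\n" for @parts;
}
```

For the sample string, splitting on = gives "long i" and "sbyte.MinValue ; i => sbyte.MaxValue ; > i++", and splitting on => gives "long i = sbyte.MinValue ; i" and "sbyte.MaxValue ; > i++". One limitation of this sketch: an operator written directly next to another operator character, as in a=-b, is not matched.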