selenocysteine - 4 months ago
Perl Question

Matching two overlapping patterns with Perl

I hope my question has not already been asked by someone else; I looked almost everywhere on the site but couldn't find an answer.

My problem is: I'm writing a Perl script which has to detect the position of every occurrence of one or another pattern in a string.

For instance:

$string = "betaalphabetabeta";
$pattern = "beta|alpha";


In this case, I would like my script to return 4 matches.

I thought that this could be easily achieved by using the match operator in some way like this:

$string =~ /beta|alpha/g;


However, since my two patterns ("alpha", "beta") are partially overlapping, the piece of code that I've just posted skips any occurrence of the first pattern when it overlaps with the second one.

E.g. if I have a string like this one:

$string = "betalphabetabeta";


it only returns 3 matches instead of 4.
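
To make the behaviour concrete, here is a minimal sketch of the plain alternation on the string from the question. The variable name @plain is only illustrative. With /g, each match resumes where the previous one ended, so the "a" shared by the first "beta" and "alpha" is already consumed by the time "alpha" could be tried.

```perl
use strict;
use warnings;

my $string = "betalphabetabeta";

# In list context, /g returns every non-overlapping match.
# The first "beta" consumes offsets 0-3, so "alpha" (offsets 3-7)
# can no longer match: its leading "a" has already been used.
my @plain = $string =~ /beta|alpha/g;

print scalar(@plain), " matches: @plain\n";   # 3 matches: beta beta beta
```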

I've tried to do something with the (?=...) lookahead, but I can't manage to combine it with the alternation (|) operator correctly...

Does anyone have any solution? Thanks for your help!

Answer

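The following uses a zero-width positive lookahead assertion, (?=...). Because the lookahead itself consumes no characters, the regex engine moves on just one position after each match, so overlapping occurrences are all found; the capturing group inside the lookahead still records what was matched.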

#!/usr/bin/perl
use strict;
use warnings;

$_ = "betalphabetabeta";

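# The lookahead matches at each position where "alpha" or "beta" starts;
# $1 holds the text captured inside it.
while (/(?=(alpha|beta))/g) {
    print $1, "\n";
}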

Prints:

C:\Old_Data\perlp>perl t9.pl
beta
alpha
beta
beta
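
Since the question asks for the position of every occurrence, the same lookahead can also report offsets. The following is a sketch rather than part of the original answer: the @hits array is a hypothetical name, and it relies on Perl's built-in @- array, where $-[1] is the 0-based start offset of capture group 1.

```perl
use strict;
use warnings;

my $string = "betalphabetabeta";
my @hits;    # hypothetical container: [offset, matched text] pairs

while ($string =~ /(?=(alpha|beta))/g) {
    # $-[1] is the 0-based offset where capture group 1 begins
    push @hits, [ $-[1], $1 ];
}

# Prints one "offset  text" line per occurrence:
#  0  beta
#  3  alpha
#  8  beta
# 12  beta
printf "%2d  %s\n", @$_ for @hits;
```

Note that if two alternatives could both match at the same starting position (for example, one being a prefix of the other), only the first alternative that succeeds is reported for that position.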