Zalán Vajda - 4 days ago
Perl Question

Perl-regex word boundary equivalence

I read that the regex

\ba


is equivalent to

(?<!\w)a


but before that I had figured out that maybe

^a|\Wa


is equivalent too.

My question is: What is the difference between those two? Could somebody write an example if they are not equivalent?

Answer

\b is equivalent to (?:(?<!\w)(?=\w)|(?<=\w)(?!\w)), so

\ba is equivalent to (?:(?<!\w)(?=\w)|(?<=\w)(?!\w))a, so

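\ba is equivalent to (?<!\w)a because a matches \w: the (?=\w) lookahead always succeeds when the next character is a, and the (?<=\w)(?!\w) branch can never match there.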
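
The equivalence above can also be spot-checked empirically. The following is only a sketch, not a proof: match_offsets is a hypothetical helper and the sample strings are arbitrary choices meant to cover start-of-string, punctuation, whitespace, underscore, and an a inside a word.

```perl
use strict;
use warnings;

# Hypothetical helper: start offsets of every match of $re in $str.
sub match_offsets {
    my ($re, $str) = @_;
    my @offsets;
    while ($str =~ /$re/g) {
        push @offsets, $-[0];    # $-[0] is the start of the last match
    }
    return @offsets;
}

# Arbitrary sample inputs (an assumption, not from the original post).
my @samples = ('a', 'banana', '!@a#$', 'a a_a-a', ' a', '');

my $mismatches = 0;
for my $str (@samples) {
    my @with_b    = match_offsets(qr/\ba/,      $str);
    my @with_look = match_offsets(qr/(?<!\w)a/, $str);

    # Compare the two offset lists as space-joined strings.
    $mismatches++ if "@with_b" ne "@with_look";
    printf "%-8s b:[%s] lookbehind:[%s]\n", $str, "@with_b", "@with_look";
}
print "mismatches: $mismatches\n";
```

If the two patterns really are equivalent, both columns should list the same offsets for every sample and the mismatch count should stay at zero.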


Both \ba and (?<!\w)a are similar to ^a|\Wa and (?:^|\W)a, to the point of being occasionally interchangeable, but they differ in what they consume. The former two match only the a, because the assertions are zero-width. The latter two also match the preceding non-word character when there is one, so they can match two characters. Compare:

use v5.14;    # enables say; s///r needs Perl 5.14+

say '!@a#$' =~ s/\ba//r;         # !@#$
say '!@a#$' =~ s/(?<!\w)a//r;    # !@#$
say '!@a#$' =~ s/^a|\Wa//r;      # !#$
say '!@a#$' =~ s/(?:^|\W)a//r;   # !#$
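
If the consuming form is preferred for some reason, the extra character can be preserved. The sketch below shows two common ways, assuming Perl 5.14+ for s///r: capturing the preceding character and putting it back, or using \K to drop it from the match. The variable names are illustrative only.

```perl
use v5.14;
use warnings;

my $input = '!@a#$';

# Zero-width assertion: only the 'a' is part of the match.
my $boundary = $input =~ s/\ba//r;

# Consuming alternative: capture the preceding character (or the empty
# string at the start of the input) and restore it in the replacement.
my $captured = $input =~ s/(^|\W)a/$1/r;

# \K (Perl 5.10+) discards what was matched so far from the match,
# so only the 'a' ends up being replaced.
my $kept = $input =~ s/(?:^|\W)\Ka//r;

say for $boundary, $captured, $kept;
```

With either workaround the substitution should leave the @ in place, the same way the \ba version does.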