angelcool.net - 2 months ago
PHP Question

Negative look ahead not working as expected

I have a bizarre situation where positive lookahead works as expected but negative lookahead doesn't. Please take a look at the following code:

<?php

$tweet = "RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #showcase our emerging #startup ecosystem. Learn more! https://example.net #Riseof…";

$patterns=array(
'/#\w+(?=…$)/',
);

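// Upper-case every match; each match starts with "#", so $m[0][0] is "#".
$tweet = preg_replace_callback($patterns, function ($m) {
    switch ($m[0][0]) {
        case "#":
            return strtoupper($m[0]);
        default:
            // Leave anything else unchanged instead of returning null.
            return $m[0];
    }
}, $tweet);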


echo $tweet;


I want to match any hashtag not followed by …$ and upper-case it (in reality it will be parsed out with an href, but for simplicity's sake just upper-case it for now).

These are regexes with their corresponding outputs:

'/#\w+(?=…$)/'
Match any hashtag ending with …$ and upper-case it; works as expected:

RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #showcase our emerging #startup ecosystem. Learn more! https://example.net #RISEOF…


'/#\w+(?!…$)/'
Match any hashtag NOT ending with …$ and upper-case it; does not work, all hashtags are upper-cased:

RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #SHOWCASE our emerging #STARTUP ecosystem. Learn more! https://example.net #RISEOf…


Thank you very much for any help, suggestions, ideas and patience.

-- Angel

Answer

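That happens because of backtracking. When the lookahead fails right after the whole hashtag, the regex engine backtracks into \w+, gives up the last character, and matches a shorter part of the hashtag instead (#Riseo, which is followed by f rather than …$). Use a possessive quantifier to avoid backtracking into the \w+ subpattern: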

/#\w++(?!…$)/
    ^^

See the regex demo

Now, one or more word characters are matched, and the (?!…$) negative lookahead is checked only once, after those word characters have been matched. If the lookahead fails, no backtracking into \w++ occurs, and the match attempt at that position fails.
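
To see the backtracking concretely, the following sketch lists what each pattern actually matches. The shortened tweet and the variable names are made up for illustration; only the two patterns come from the question and the answer.

```php
<?php
// Hypothetical, shortened tweet; the last hashtag is truncated with "…".
$tweet = "Learn about #startup life #Riseof…";

// Greedy \w+: when the lookahead fails after "#Riseof", the engine gives
// back one character and retries, so "#Riseo" (followed by "f") matches.
preg_match_all('/#\w+(?!…$)/', $tweet, $greedy);

// Possessive \w++: the word characters are never given back, so the
// truncated hashtag is rejected as a whole.
preg_match_all('/#\w++(?!…$)/', $tweet, $possessive);

print_r($greedy[0]);     // #startup, #Riseo
print_r($possessive[0]); // #startup
```

The partial match #Riseo is why the original output showed #RISEOf… with a lower-case f.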

See more on possessive quantifiers here.
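
A possessive quantifier is shorthand for an atomic group, so \w++ can also be written as (?>\w+). Below is a minimal sketch that applies that equivalent form with an upper-casing callback like the one in the question; the shortened tweet is hypothetical.

```php
<?php
// Hypothetical, shortened tweet; the last hashtag is truncated with "…".
$tweet = "Learn about #startup life #Riseof…";

// (?>\w+) is an atomic group: once it has matched, the engine cannot
// backtrack into it, which is the same behaviour as the possessive \w++.
$result = preg_replace_callback(
    '/#(?>\w+)(?!…$)/',
    function (array $m): string {
        return strtoupper($m[0]);
    },
    $tweet
);

echo $result, PHP_EOL; // Learn about #STARTUP life #Riseof…
```

Either spelling should behave the same here; the possessive form is simply shorter.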