Asara - 5 days ago
PHP Question

preg_replace and hidden chars or hidden encoding

I have a preg_replace pattern that works quite well on phpliveregex.com:

(\>*\s?)_______________________________________________\n(\>*\s?)(talk|tagging|talk-us|talk-gb|talk-de|osm-talk) mailing list\n(\>*\s?)(talk|tagging|talk-us|talk-gb|talk-de|osm-talk)@openstreetmap.org\n(\>*\s?)https://lists.openstreetmap.org/listinfo/(talk|tagging|talk-us|talk-gb|talk-de|osm-talk)


For example, here it deletes all the mailing list signatures:

>> Text, blablabla
>>
>> _______________________________________________
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>
>
>
>------------------------------------------------------------------------
>
>_______________________________________________
>talk mailing list
>talk@openstreetmap.org
>https://lists.openstreetmap.org/listinfo/talk

--
personal signature, blabla._______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


But when I try exactly the same in PHP with preg_replace, only the last of the three mailing list signatures is deleted. And that happens only with the given variable. When I echo the variable content to the browser and copy that into a new variable like
$text = 'long echoed text';
it works.

$slugs = 'talk|tagging|talk-us|talk-gb|talk-de|osm-talk';
$pattern = '!(\>*\s?)_______________________________________________\n(\>*\s*)('.$slugs.') mailing list\n(\>*\s*)('.$slugs.')@openstreetmap.org\n(\>*\s*)https://lists.openstreetmap.org/listinfo/('.$slugs.')!mi';
return preg_replace($pattern,'',$text);


So I guess there must be some hidden encoding or hidden characters in the original variable. But how can I find out what the problem is?

Edit: it now looks to me like there is a problem with line breaks and the
>
that follows them, but I still don't know how to check this exactly or how to solve it.

Edit 2: when I try $text==$text2 (where $text is the original and $text2 is the result of echo $text), I get FALSE.

TL;DR: When I use the given variable, it does not work. But when I echo the variable to the browser and copy the text into a new variable, it works. What is hidden there?

Answer

Right now, the expression above only matches line breaks encoded as "\n". However, depending on the environment, line breaks can also be encoded as "\r" or "\r\n". So instead of \n, you should use:

[\n\r]+
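
As a rough sketch of how that change could be applied to the code from the question: the function name `stripListSignatures` is made up for illustration, the long underscore line is shortened to `_{10,}`, and the dots in the host name are escaped. It uses `\r?\n` instead of `[\n\r]+`, which is a slightly stricter variant that accepts exactly one "\n" or "\r\n" per line; either form should handle Windows-style line endings.

```php
<?php
// Hypothetical helper; name and signature are assumptions, not from the question.
function stripListSignatures(string $text): string
{
    $slugs = 'talk|tagging|talk-us|talk-gb|talk-de|osm-talk';

    // One line break, either "\n" or "\r\n".
    $nl = '\r?\n';

    // Same structure as the pattern in the question, with every \n
    // replaced by $nl. "_{10,}" stands in for the spelled-out underscore line.
    $pattern = '!(>*\s?)_{10,}' . $nl
        . '(>*\s*)(' . $slugs . ') mailing list' . $nl
        . '(>*\s*)(' . $slugs . ')@openstreetmap\.org' . $nl
        . '(>*\s*)https://lists\.openstreetmap\.org/listinfo/(' . $slugs . ')!i';

    return preg_replace($pattern, '', $text);
}
```

Calling `stripListSignatures($text)` in place of the original `preg_replace` call should then remove the quoted signatures regardless of which of the two line-break styles the mail body uses.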

See also this question and the corresponding article on Wikipedia.
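
To check which line-break encoding (or other invisible character) the original variable actually contains, one option is to make control characters visible before printing it. The following is a minimal sketch: the helper name `revealHidden` and the sample string are assumptions for illustration, and it relies only on PHP's built-in `addcslashes` and `substr_count`.

```php
<?php
// Hypothetical helper: render control characters (bytes 0-31) as visible
// C-style escapes, so "\r\n" and "\n" can be told apart in the output.
function revealHidden(string $text): string
{
    return addcslashes($text, "\0..\37");
}

// Made-up sample input with Windows-style line endings.
$sample = ">talk mailing list\r\n>talk@openstreetmap.org\r\n";

// Prints the text with its line breaks shown as escape sequences.
echo revealHidden($sample), PHP_EOL;

// Counts how many "\r\n" pairs the text contains.
echo 'CRLF pairs: ', substr_count($sample, "\r\n"), PHP_EOL;
```

If the escaped output of the real `$text` shows `\r\n` where the pattern expects `\n`, that would explain why the expression fails to match across those lines.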
