elterr1ble - 1 month ago
PHP Question

PHP preg_match/replace doesn't work on character "ndash" after file_get_contents

I get a string via file_get_contents($file).

Why can't I replace a "–" (not a minus sign but the HTML entity &ndash;) with PHP's preg_replace function? preg_match doesn't work either:

e.g.

The output of $file is "blah – blah".

$str = file_get_contents($file);
$str = preg_replace('/–/', 'test', $str);
echo $str;


should return "blah test blah" but returns "blah – blah".

Why is that, and how can I replace an ndash?

Thanks for your help!

Answer

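It seems the file contains an HTML entity for the en dash (such as &ndash;) rather than the character itself, so a pattern containing a literal "–" never matches. To get the plain-text character, decode the string with html_entity_decode first.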
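To confirm which form the file actually holds, it can help to look for the entity text and for the raw bytes separately. This is only a minimal sketch, assuming the input is UTF-8; the helper name describeDash and the sample strings are made up for illustration, and only the two most common entity spellings are checked.

```php
<?php
// Hypothetical helper: reports how an en dash is encoded in $str.
function describeDash(string $str): string
{
    // Named or decimal numeric entity: a literal "–" pattern cannot match these.
    if (strpos($str, '&ndash;') !== false || strpos($str, '&#8211;') !== false) {
        return 'entity';
    }
    // Raw UTF-8 bytes of U+2013 (EN DASH).
    if (strpos($str, "\xE2\x80\x93") !== false) {
        return 'literal';
    }
    return 'none';
}

echo describeDash('blah &ndash; blah'), "\n";       // entity
echo describeDash("blah \xE2\x80\x93 blah"), "\n";  // literal
```

If this reports "entity" for the string returned by file_get_contents, decoding before matching is the likely fix.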

Use

$str = preg_replace('/–/', 'test', html_entity_decode($str));
                                   ^^^^^^^^^^^^^^^^^^^^^^^^

PHP demo:

$str = 'blah &ndash; blah';
echo "Original: " . $str . "\n";
// Decode the entity to a literal "–" before matching
$str = preg_replace('/–/', 'test', html_entity_decode($str));
echo "Replaced: " . $str;

Output:

Original: blah &ndash; blah
Replaced: blah test blah
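
Note that html_entity_decode decodes every entity in the string, which may be unwanted if the result is written back out as HTML. One possible alternative, sketched here under the assumption that the input is UTF-8 and that only the named and decimal forms of the entity occur, is to match both the entity and the literal character in a single pattern. The sample string is made up for illustration.

```php
<?php
// Assumed sample input: both forms of the en dash plus an unrelated entity.
$str = "a &ndash; b \u{2013} c &lt;d&gt;";

// The u modifier makes PCRE treat pattern and subject as UTF-8,
// so \x{2013} matches the literal en dash character.
$out = preg_replace('/&ndash;|&#8211;|\x{2013}/u', 'test', $str);

echo $out; // a test b test c &lt;d&gt;
```

The other entities (&lt; and &gt; here) are left untouched, which is the main difference from decoding the whole string first.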