Cofey - 5 months ago
PHP Question

How to convert HTML entities like &#8211; to their character equivalents?

I am creating a file that is to be saved on a local user's computer (not rendered in a web browser).

I am currently using html_entity_decode, but this isn't converting characters like &#8211; (which is the n-dash), and I was wondering what other function I should be using.

For example, when the file is imported into the software, instead of the n-dash or just a - it shows up as the literal text &#8211;. I know I could use str_replace, but if it's happening with this character, it could happen with many others since the data is dynamic.

Answer

You need to define the target character set. &#8211; is not a valid character in ISO-8859-1, which is the default character set of html_entity_decode in PHP versions before 5.4, so it's not decoded. Define UTF-8 as the output charset and it will decode:

echo html_entity_decode('&#8211;', ENT_NOQUOTES, 'UTF-8');
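
Since the data is dynamic, the same call can be applied to the whole string rather than to one entity at a time. The following is only a minimal sketch: the helper name decodeEntities and the sample string are made up for illustration, and the flags ENT_QUOTES | ENT_HTML5 (available from PHP 5.4) are an assumption about what the export needs, not something the question specifies.

```php
<?php
// Hypothetical helper: decode every HTML entity in a string to UTF-8 text.
// ENT_QUOTES also decodes &quot; and &#039;, and ENT_HTML5 selects the
// HTML5 entity table. Both flags are assumptions for this sketch.
function decodeEntities($text)
{
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Made-up sample input mixing a numeric and a named entity.
$raw = 'Pages 10&#8211;12 &amp; appendix';

// The result contains a real n-dash (bytes E2 80 93 in UTF-8) and a plain &.
echo decodeEntities($raw), "\n";
```

The decoded string can then be written to the file as-is, provided the importing software reads the file as UTF-8.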

If at all possible, you should avoid HTML entities to begin with. I don't know where that encoded data comes from, but if you're storing it like this in the database or elsewhere, you're doing it wrong. Always store data UTF-8 encoded and only convert to HTML entities or otherwise escape for output when necessary.
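
As a rough illustration of that advice, the sketch below keeps the stored value as plain UTF-8 and escapes it only at the point where it goes into HTML. The function name escapeForHtml and the sample value are hypothetical; htmlspecialchars is the standard PHP function for this kind of output escaping.

```php
<?php
// Hypothetical helper: escape a UTF-8 string for use in HTML output only.
function escapeForHtml($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Stored value: plain UTF-8 with a real n-dash, no entities (made-up sample).
$stored = "Pages 10\xE2\x80\x9312 & appendix";

// For a web page, escape on the way out; only the & becomes &amp;.
echo escapeForHtml($stored), "\n";

// For a file export, no escaping or decoding is needed at all.
echo $stored, "\n";
```

With the data stored this way, the file export never has to call html_entity_decode in the first place.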