Devon - 6 months ago
PHP Question

HTML entities issue on certain special characters

Some special characters in the database (most likely pasted from a word processor) are not being converted properly for display in HTML.

To diagnose this further, I have just written out the data to a text file.

file_put_contents('/tmp/text.txt', $row['text']."\n\n" . htmlentities($row["text"]) . "\n\n", FILE_APPEND);

When comparing, I see this:

# grep "and knows what" text.txt
and knows what “good” looks like.
and knows what �good� looks like.

Any idea why the conversion is being thrown off?

This may already be covered somewhere, but it is not exactly easy to search for special characters.


Resolved by passing utf-8 as the encoding argument to htmlentities. I had tried utf8, but it requires the hyphen.

htmlentities($str, $flags, 'utf-8');
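For anyone wanting to reproduce this, here is a minimal sketch of the call with an explicit encoding. The sample string is a hypothetical stand-in for $row['text'], and the flag choice (ENT_QUOTES | ENT_SUBSTITUTE, available from PHP 5.4) is my assumption, not something taken from the question.

```php
<?php
// Hypothetical sample standing in for $row['text']; the curly quotes are
// written as raw UTF-8 bytes so the source file's own encoding does not matter.
$text = "and knows what \xE2\x80\x9Cgood\xE2\x80\x9D looks like.";

// Assumed flags: convert both quote styles, and substitute invalid byte
// sequences instead of returning an empty string.
$flags = ENT_QUOTES | ENT_SUBSTITUTE;

// Tell htmlentities explicitly that the input is UTF-8.
$encoded = htmlentities($text, $flags, 'UTF-8');

echo $encoded, "\n";
// and knows what &ldquo;good&rdquo; looks like.
```

With the encoding stated, each multi-byte quote is treated as one character and mapped to its named entity rather than being processed byte by byte.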



htmlentities($yourString, $flags, "UTF-8")

If that doesn't solve it:

If everything isn't set up for UTF-8 on your server, your text may get incorrectly converted somewhere along the way. You may need to enable UTF-8 encoding for both Apache and PHP in their config files. (If you're not using Apache, skip that step.)
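Before touching server config, it can help to check whether the stored bytes are valid UTF-8 in the first place. The sketch below assumes the mbstring extension is loaded. to_utf8 is a hypothetical helper name, and treating non-UTF-8 input as Windows-1252 is only a guess that tends to fit text pasted from Word.

```php
<?php
// Hypothetical helper (not part of PHP): return $text as UTF-8.
// Assumes mbstring is available and that any input which is not valid
// UTF-8 is Windows-1252. That second assumption is a guess.
function to_utf8($text)
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Windows-1252 curly quotes are the single bytes 0x93 and 0x94.
$raw = "and knows what \x93good\x94 looks like.";

echo htmlentities(to_utf8($raw), ENT_QUOTES, 'UTF-8'), "\n";
```

If the check fails for your rows, the data was probably stored in a different encoding than the connection or page expects, and fixing that at the source is better than converting on every read.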

Apache UTF-8 setup:

Edit either your charset.conf (ideal) or your httpd.conf file, adding this line to the end:

AddDefaultCharset utf-8

(If you don't have access to Apache's config files, you can create a ".htaccess" file in your HTML's root directory with that same code.)

PHP UTF-8 setup:

Edit your php.ini file, searching for "default_charset", and change it to:

default_charset = "utf-8"
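To confirm the setting took effect, or to apply it for a single script when php.ini is out of reach, something like the following should work. This is a sketch that assumes PHP 5.6 or later, where htmlentities falls back to default_charset when no encoding argument is given.

```php
<?php
// Show the charset PHP currently falls back to.
echo 'before: ', ini_get('default_charset'), "\n";

// Per-script override, for cases where php.ini cannot be edited.
ini_set('default_charset', 'UTF-8');
echo 'after: ', ini_get('default_charset'), "\n";

// With the default in place, the encoding argument can be omitted (PHP 5.6+).
echo htmlentities("\xE2\x80\x9Cgood\xE2\x80\x9D"), "\n";
```

Passing the encoding explicitly in each htmlentities call is still the safer habit, since it does not depend on how a given server is configured.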

Restart Apache:

Depending on your server type, this command may do the trick via command line:

sudo service apache2 restart

Source: Proper character encoding to display "”"?