Chris - 2 months ago
HTML Question

UTF-8: shown correctly in the database, but not in HTML despite the utf-8 charset

I use MySQL 5.1 and loaded about 2.7 million lines from a UTF-8 encoded text file into a table that is declared as utf8_unicode_ci; all char fields are declared as utf8_unicode_ci as well. The data was loaded using LOAD DATA INFILE ...

In the database itself all the characters seem to be correct. However, when I print them using PHP, they show up as ???, even though I declare UTF-8 in the HTML head:

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
...


In another table (also using UTF-8), where I inserted text from a submitted form, the characters look strange in the database, but are displayed correctly again when I print them via a SELECT... query.

So I was wondering: what is wrong? Should UTF-8 characters look correct in the database, or should they look strange there but come out fine when you SELECT them again? And where is the problem: in loading the file into the database, in the HTML, or somewhere in between?

Thank you very much for any hint or suggestion! :)

Answer

Note: MySQL's utf8 charset is limited. It supports only Unicode characters in the BMP, which take up no more than three bytes in UTF-8. You should be using utf8mb4 instead, which is available from MySQL 5.5.3 onwards, so on 5.1 it requires an upgrade.
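
To get a feel for where the three-byte limit bites, here is a small Python sketch (Python is used only for illustration; the helper names utf8_len and fits_mysql_utf8 and the sample characters are made up for this example) that counts how many UTF-8 bytes a character needs:

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """True if every character needs at most three UTF-8 bytes,
    i.e. lies in the BMP and so fits MySQL's three-byte utf8 charset."""
    return all(utf8_len(ch) <= 3 for ch in text)


# Hypothetical sample characters: ASCII, Latin, the euro sign, an emoji.
for ch in ["a", "é", "€", "😀"]:
    print(repr(ch), utf8_len(ch), fits_mysql_utf8(ch))
```

The first three samples need one, two and three bytes and fit; the emoji needs four bytes and would need utf8mb4.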

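If phpMyAdmin displays your entered data as correct Unicode text, then my bet is that you are not doing SET NAMES utf8 after connecting. Without it, the connection character set typically defaults to latin1, and characters that latin1 cannot represent are converted to ? on their way to PHP.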
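
Both symptoms from the question can be imitated without a database. The following Python sketch only mimics the conversions. It does not talk to MySQL, it assumes the connection charset is latin1, and all function names are hypothetical:

```python
def read_over_latin1(stored: str) -> str:
    """Imitate the server converting correctly stored text for a latin1
    connection: characters without a latin1 equivalent become '?'."""
    return stored.encode("latin-1", errors="replace").decode("latin-1")


def write_over_latin1(client_text: str) -> str:
    """Imitate a client sending UTF-8 bytes that the server takes for
    latin1: every byte ends up as its own character in the table."""
    return client_text.encode("utf-8").decode("latin-1")


def read_back_over_latin1(stored: str) -> str:
    """Imitate reading that garbled text back over the same connection:
    the original bytes come out again and decode as UTF-8."""
    return stored.encode("latin-1").decode("utf-8")


# Case 1: data loaded correctly (LOAD DATA INFILE), read over latin1.
print(read_over_latin1("Ω and 日本"))   # prints "? and ??"

# Case 2: form data written and read over the same mismatched connection.
garbled = write_over_latin1("é")
print(garbled)                          # prints "Ã©" (what the table shows)
print(read_back_over_latin1(garbled))   # prints "é"
```

The first case matches the ??? output for correctly stored rows. The second matches the table that looks strange in the database but prints fine. In PHP, the mismatch is usually avoided by setting the connection charset right after connecting, either with the SET NAMES statement above or, when using mysqli, with its set_charset method.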