Artjom Kurapov - 4 months ago
JSON Question

json_encode(): Invalid UTF-8 sequence in argument

I'm calling json_encode() on data that comes from a MySQL database with utf8_general_ci collation. The problem is that some rows contain malformed data that I can't clean, for example a broken symbol, so once it reaches json_encode(), it fails with json_encode(): Invalid UTF-8 sequence in argument.

I've tried utf8_encode() and utf8_decode(), even combined with mb_check_encoding(), but the invalid data keeps getting through and causing havoc.

Running PHP 5.3.10 on Mac. So the question is: how can I clean up invalid UTF-8 symbols, keeping the rest of the data, so that json_encode() would work?

Update. Here is a way to reproduce it:

echo json_encode(pack("H*", 'c32e'));

Answer

Seems like the symbol was Å. Since the data consists of surnames that shouldn't be public, only the first letter was shown, and that was done with just $lastname[0]. That is wrong for multibyte strings: it returns only the first byte (0xC3 for Å) rather than the first character, which leaves an incomplete UTF-8 sequence and caused the whole hassle. I changed it to mb_substr($lastname, 0, 1) and it works like a charm.
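To illustrate the difference, here is a minimal sketch. The surname "Åberg" is made up for the example, and scrub_utf8() is a hypothetical helper (not part of PHP) for rows that are already damaged; it assumes the mbstring extension is loaded. Treat it as one possible approach rather than the definitive fix, since the real fix is to stop cutting strings by byte in the first place.

```php
<?php
// Hypothetical surname "Åberg"; "Å" is the two-byte UTF-8 sequence 0xC3 0x85.
$lastname = "\xC3\x85berg";

// Byte indexing returns only the first byte, which is not valid UTF-8 on its own.
$firstByte = $lastname[0];

// mb_substr() counts characters, so it returns the complete "Å".
$firstChar = mb_substr($lastname, 0, 1, 'UTF-8');

var_dump(bin2hex($firstByte));                    // string(2) "c3"
var_dump(mb_check_encoding($firstByte, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($firstChar, 'UTF-8')); // bool(true)

// Hypothetical helper: drop byte sequences that are not valid UTF-8.
// Which neighbouring bytes survive can differ between PHP versions,
// so check the result on your own data before relying on it.
function scrub_utf8($text)
{
    $previous = mb_substitute_character();   // remember the current setting
    mb_substitute_character('none');         // drop invalid bytes instead of substituting
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);      // restore the setting
    return $clean;
}

// The string from the question: 0xC3 followed by "."
$scrubbed = scrub_utf8(pack("H*", 'c32e'));
var_dump(mb_check_encoding($scrubbed, 'UTF-8'));
```

Scrubbing like this only hides the symptom; the surname still loses its first letter, which is why switching to mb_substr() at the point where the string is cut is the better choice.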
