MattBelanger - 6 months ago
Converting HTML Entities in UTF-8 to SHIFT_JIS

I am working with a website that needs to target old Japanese mobile phones that are not Unicode-enabled. The problem is that the text for the site is saved in the database as HTML entities (i.e., &#1234;). This database absolutely cannot be changed, as it is used for several hundred websites.

What I need to do is convert these entities to actual characters, and then convert the string encoding before sending it out, as the phones render the entities without converting them first.

I've tried both

, but all they are doing is converting the encoding of the entities, but not creating the text.

Thanks in advance


I have also tried
. It is producing the same results - an unconverted string.

Here is the sample data I am working with.

The desired result: シェラトン・ヌーサリゾート&スパ

The HTML Codes:

The output of
html_entity_decode([the string above],ENT_COMPAT,'SHIFT_JIS');
is identical to the input string.


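Just make sure the entities are decoded into the right code points before you change the encoding: decode them into UTF-8 first, then convert the decoded string to the target encoding. If the original encoding is UTF-8, for example: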

$originalEncoding = 'UTF-8'; // only assumed, you have not shared that info so far
$targetEncoding = 'SHIFT_JIS';
$string = '... whatever you have ... ';
// Step 1: normalize to UTF-8 (superfluous when the source is already UTF-8, but it shows the idea)
$string = mb_convert_encoding($string, 'UTF-8', $originalEncoding);
// Step 2: decode the entities into real UTF-8 characters
$string = html_entity_decode($string, ENT_COMPAT, 'UTF-8');
// Step 3: convert the decoded text to the target encoding
$stringTarget = mb_convert_encoding($string, $targetEncoding, 'UTF-8');
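The same three steps can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name entitiesToEncoding is made up for illustration, the stored text is assumed to be UTF-8 (or plain ASCII entities), PHP 7+ with the mbstring extension is assumed, and ENT_QUOTES is my choice here (instead of ENT_COMPAT) so that &#039; is decoded as well.

```php
<?php
// Hypothetical helper, not part of the original post: decode HTML entities
// via UTF-8, then re-encode the result for a client that is not Unicode-enabled.
function entitiesToEncoding(
    string $text,
    string $targetEncoding = 'SHIFT_JIS',
    string $sourceEncoding = 'UTF-8' // assumption: how the database text is stored
): string {
    // Normalize to UTF-8 so the entity decoder sees a known encoding.
    $utf8 = mb_convert_encoding($text, 'UTF-8', $sourceEncoding);

    // Turn named and numeric entities into real UTF-8 characters.
    $decoded = html_entity_decode($utf8, ENT_QUOTES, 'UTF-8');

    // Only now switch to the legacy encoding the phones expect.
    return mb_convert_encoding($decoded, $targetEncoding, 'UTF-8');
}

// Sample input: the first five katakana of the desired result, as numeric entities.
$sjis = entitiesToEncoding('&#12471;&#12455;&#12521;&#12488;&#12531;');

// Print the raw bytes as hex so the result can be checked on any terminal.
echo bin2hex($sjis), PHP_EOL;
```

Converting the returned string back with mb_convert_encoding($sjis, 'UTF-8', 'SHIFT_JIS') should give the readable text again, which is an easy way to check the result on a Unicode terminal. Characters that have no Shift_JIS equivalent cannot survive the last step, so it is worth testing with real data from the database.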