Arya Arya - 5 months ago 22
PHP Question

How to print Hexadecimal UTF-8 characters in PHP

How can I print UTF-8 characters from their hexadecimal UTF-8 values? I read this post, but it did not solve my problem.

I work with many strings that are Sanskrit words stored in a database. I have their HTML values, 16-bit binary code points, hex codes, and decimal codes, but I want to be able to work with their hexadecimal UTF-8 values and output their symbolic form.

For example, here is a word, आम, that has a binary UTF-8 value of 111000001010010010111000111000001010010010101110. I want to see/store/print its hexadecimal UTF-8 value and print its symbolic form.

For example, here's a snippet of my code:

$BinaryUTF8 = "111000001010010010000110111000001010010010101110";

$Temporary = dechex(bindec($BinaryUTF8));

$HexadecimalUTF8 = NULL;

for($i = 0; $i < strlen($Temporary); $i+=2)
{
    $HexadecimalUTF8 .= "\x".$Temporary[$i].$Temporary[$i+1];
}

$Test = "\xe0\xa4\x86\xe0\xa4\xae";

echo "\$Test = ".$Test;

echo "<br>";

echo "\$HexadecimalUTF8 = ".$HexadecimalUTF8;


The output is:

$Test = आम
$HexadecimalUTF8 = \xe0\xa4\x86\xe0\xa4\xae


$Test outputs the desired characters.

Why does $HexadecimalUTF8 not output the desired characters?

Answer

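Your binary is wrong (I have fixed it below).

You are building a string that contains the literal text "\xe0" (a backslash, an x, and two hex digits) instead of the single byte that the escape sequence represents. PHP only interprets \x escapes when it parses a string literal in the source code, not when you concatenate the pieces at run time. The hex is really just a number.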
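To make the difference concrete, here is a minimal sketch (the variable names $literal and $built are only for illustration) that compares a string literal containing real escape sequences with one assembled at run time:

```php
<?php
// Escape sequences such as \xe0 are interpreted only when PHP parses a
// double-quoted string literal, not when strings are joined at run time.
$literal = "\xe0\xa4\x86";   // three bytes: the UTF-8 encoding of आ
$built   = "\x" . "e0";      // "\x" has no hex digits after it, so it stays as backslash + x

echo strlen($literal), "\n";   // 3
echo strlen($built), "\n";     // 4 (the characters \, x, e, 0)
echo bin2hex($literal), "\n";  // e0a486
```

The second string looks like an escape sequence when printed, but it is just four ordinary characters.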

This seems to work now:

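<?php
$BinaryUTF8 = "111000001010010010000110111000001010010010101110";

// Convert the bit string to a hex string, here "e0a486e0a4ae".
$Temporary = dechex(bindec($BinaryUTF8));

$HexadecimalUTF8 = '';

// Build the literal text \xe0\xa4... (single quotes keep '\x' as two plain characters).
for ($i = 0; $i < strlen($Temporary); $i += 2)
{
    $HexadecimalUTF8 .= '\x' . $Temporary[$i] . $Temporary[$i+1];
}

$Test = "\xe0\xa4\x86\xe0\xa4\xae";

echo "\$Test = " . $Test;

echo "<br>";
echo "\$HexadecimalUTF8 = " . makeCharFromHex($HexadecimalUTF8);

// Replace each literal \xNN sequence with the byte whose value is NN.
function makeCharFromHex($hex) {
    return preg_replace_callback(
        '#\\\\x([0-9A-F]{2})#i',
        function ($matches) {
            // $matches[1] holds only the two hex digits, so hexdec() gets clean input.
            return chr(hexdec($matches[1]));
        },
        $hex
    );
}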
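
If the "\x" text form is not needed as an intermediate step, the bytes can also be built directly from the bit string. The following is a sketch rather than part of the original answer: binaryUtf8ToString is a hypothetical helper name, and it assumes the input length is a multiple of 8. It reads one octet at a time, so it does not depend on the whole value fitting in a PHP integer.

```php
<?php
// Hypothetical helper (not from the original answer): turns a string of
// binary digits holding UTF-8 bytes into the raw byte string.
function binaryUtf8ToString(string $bits): string
{
    $bytes = '';
    // Process 8 bits at a time so the input is not limited by integer size.
    foreach (str_split($bits, 8) as $octet) {
        $bytes .= chr(bindec($octet));
    }
    return $bytes;
}

$BinaryUTF8 = "111000001010010010000110111000001010010010101110";
$word = binaryUtf8ToString($BinaryUTF8);

echo bin2hex($word), "\n";   // e0a486e0a4ae
echo $word, "\n";            // आम

// hex2bin() goes the other way: from a plain hex string to the bytes.
echo hex2bin("e0a486e0a4ae") === $word ? "same\n" : "different\n";   // same
```

bin2hex() gives the hexadecimal UTF-8 value for storage or display, and the byte string itself is what the browser renders as the word.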

This question reminds me how poor PHP's multibyte support is.
