UPDATE: The answer is here: PHP unserialize fails with non-encoded characters?
I'm trying to match objects with in_array. This works fine except for the object with this string as a property. Visually they are the same, but when I do a var_dump PHP sees different lengths.
string(26) "Waar zijn mijn centjes👼"
Let's look at the hex dump of your strings:
As we can clearly see, the only difference is at the end:
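A dump like this can be produced with PHP's bin2hex. The following is a minimal sketch, assuming the two strings from the question are held in the hypothetical variables $utf8 and $entity; it finds the shared prefix and prints only the bytes that differ:

```php
<?php
// Assumed inputs: the two visually identical strings from the question.
$utf8   = "Waar zijn mijn centjes\xF0\x9F\x91\xBC"; // raw UTF-8 bytes of U+1F47C
$entity = 'Waar zijn mijn centjes&#x1f47c;';        // HTML numeric character reference

// XOR-ing two strings yields "\0" wherever the bytes match, so counting
// the leading "\0" bytes gives the length of the common prefix.
$common = strspn($utf8 ^ $entity, "\0");

// bin2hex() prints two hex digits per byte.
$tailUtf8   = bin2hex(substr($utf8, $common));
$tailEntity = bin2hex(substr($entity, $common));

echo strlen($utf8), ' bytes, tail: ', $tailUtf8, PHP_EOL;     // 26 bytes, tail: f09f91bc
echo strlen($entity), ' bytes, tail: ', $tailEntity, PHP_EOL; // 31 bytes, tail: 26237831663437633b
```

strlen counts bytes, not characters, which is why var_dump reports different lengths for strings that look identical on screen.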
So what's the difference?
f09f91bc is the hex representation of the UTF-8 encoding of the U+1F47C BABY ANGEL character (👼), so that one is perfectly valid UTF-8.
26237831663437633b is not the UTF-8 encoding of that character. The bytes are plain ASCII and spell out &#x1f47c;, which is simply the HTML numeric character reference for the baby angel character.
So somewhere the angel must have been translated to its HTML numeric character reference, and that is not something that happens just by writing to and reading from a file or a DB. I'd guess it happened somewhere in your output processing.
You may use html_entity_decode to translate the HTML entities back to their UTF-8 equivalents:
$a = html_entity_decode('Waar zijn mijn centjes&#x1f47c;');
$b = 'Waar zijn mijn centjes👼';
var_dump($a === $b); // bool(true)
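For the in_array matching from the question, one option is to decode both sides before comparing. This is only a sketch under assumptions: the compared property is a plain string, and normalizeText() is a hypothetical helper, not something from the original code:

```php
<?php
// Hypothetical helper: decode HTML entities so that the entity form and
// the raw UTF-8 form of the same text end up byte-identical.
function normalizeText(string $s): string
{
    return html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Assumed data: one value holds raw UTF-8, the other an HTML entity.
$stored   = ["Waar zijn mijn centjes\xF0\x9F\x91\xBC"];
$incoming = 'Waar zijn mijn centjes&#x1f47c;';

// Without normalization the strict comparison fails.
$rawMatch = in_array($incoming, $stored, true);

// With both sides normalized the strings match.
$normalizedMatch = in_array(
    normalizeText($incoming),
    array_map('normalizeText', $stored),
    true
);

var_dump($rawMatch);        // bool(false)
var_dump($normalizedMatch); // bool(true)
```

Decoding at comparison time only hides the symptom. If you can find the place where the entity is introduced, fixing it there keeps the stored data consistent.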