Paul Stanley - 5 months ago
PHP Question

Latin-1 / UTF-8 encoding php

I have a database in UTF-8 encoding with some Latin-1 data mixed in (I think that is the problem).

This is how the characters look in the database.

Ä° (should be İ)

When I set the header to

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

Then the characters come out as:


When I remove the header, they come out as they are in the database. I want them to come out like this:


I'm looking for a way to remedy this in PHP after the fact, if it is possible. I am unable to correct the data itself at this time, which would be the correct thing to do.

Jon

Your HTML output needs to be in a single encoding; there is no way around that. This means that content in any other encoding has to be converted to your HTML encoding first. While that is possible with iconv or mb_convert_encoding, there are two problems you have to solve:

  1. You need to know (or guess) the current encoding of the content
  2. You need to do this manually, everywhere
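The conversion call itself is short once the source encoding is known. Here is a minimal sketch, assuming the mbstring and iconv extensions are loaded; the sample string is made up for illustration and is not taken from your data:

```php
<?php
// "\xE9" is the single ISO-8859-1 (Latin-1) byte for "é".
// The source encoding is assumed to be known here; nothing is detected.
$latin1 = "caf\xE9";

// Two equivalent ways to convert Latin-1 to UTF-8.
$viaMb    = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$viaIconv = iconv('ISO-8859-1', 'UTF-8', $latin1);

// In UTF-8, "é" becomes the two bytes C3 A9.
echo bin2hex($viaMb), "\n";    // 636166c3a9
echo bin2hex($viaIconv), "\n"; // 636166c3a9
```

Both functions need to be told the source encoding, which is why problem 1 comes first.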

For example, a theoretical solution would be to pick UTF-8 as your HTML encoding and then do this for all strings you are going to output:

$string = '...'; // from the database

// If it's not already valid UTF-8, assume Latin-1 and convert it
if (mb_detect_encoding($string, 'utf-8', true) === false) {
    $string = mb_convert_encoding($string, 'utf-8', 'iso-8859-1');
}

echo $string;

The code above assumes that non-UTF-8 content is encoded in latin-1, which is reasonable according to your question.
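Since this has to happen for every string you output, it may help to wrap the check in one function. The sketch below does that; to_utf8 is a hypothetical helper name, and it uses mb_check_encoding rather than mb_detect_encoding for the validity test, under the same assumption that anything that is not valid UTF-8 is Latin-1:

```php
<?php
// Hypothetical helper (not part of PHP or of the snippet above):
// returns $string as UTF-8, ASSUMING that any input which is not
// valid UTF-8 is encoded in ISO-8859-1 (Latin-1).
function to_utf8(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8, leave it untouched
    }

    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

echo to_utf8("caf\xE9"), "\n";     // Latin-1 input is converted to UTF-8
echo to_utf8("caf\xC3\xA9"), "\n"; // valid UTF-8 input passes through unchanged
```

Keep in mind that this is still a guess: some Latin-1 byte sequences also happen to be valid UTF-8, and those will pass through unconverted.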