gulty gulty - 1 year ago 263
PHP Question

PHP to MySql to CSV to Excel UTF-8

I know this has been discussed several times but yet I'm getting crazy dealing with this problem. I have a form with a submit.php action. At first I didn't change anything about the charsets, I didn't use any utf8 header information.. The result was that I could read all the ä,ö,ü etc correctly inside the database. Now exporting them to

and importing them to Excel as
UTF-8 charset
(also tested all the others) results in an incorrect charset.

Now what I tried:


header("Content-Type: text/html; charset=utf-8");

I dropped my database and created a new one:

create database db CHARACTER SET utf8 COLLATE utf8_general_ci;
create table ...

I changed my my.cnf and restarted my sql server:



If I connect to my db via bash I receive the following output:

| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/charsets/ |

A php test:


Giving me:

Current character set: utf8 object(stdClass)#3 (8) { ["charset"]=> string(4) "utf8" ["collation"]=> string(15) "utf8_general_ci" ["dir"]=> string(0) "" ["min_length"]=> int(1) ["max_length"]=> int(3) ["number"]=> int(33) ["state"]=> int(1) ["comment"]=> string(13) "UTF-8 Unicode" }

Now I use:

mysql -uroot -ppw db < require.sql > /tmp/test.csv

require.sql is simply a

select * from table;

And again I'm unable to import it as a csv into Excel no matter if I choose UTF-8 or anything else. It's always giving me some crypto..

Hopefully someone got a hint what might went wrong here..


E: TextMate is giving me a correct output so it seems that the conversion actually worked and it's and Excel issue? Using Microsoft Office 2011.

E2: Also tried the same stuff with latin1 - same issue, cannot import special characters into excel without breaking them. Any hint or workaround?

E3: I found a workaround which is working with the Excel Import feature but not with double clicking the .csv.

iconv -f utf8 -t ISO-8859-1 test.csv > test_ISO.csv

Now I'm able to import the csv into excel using Windows(ANSI). Still annoying to have to use this feature instead of doubleclicking. Also I really don't get why UTF8 isn't working, not even with the import feature, BOM added and the complete database in UTF8.

Comma separation turned out to be a mess as well.
1. Concat_WS works only partly because it's adding a stupid concat_ws(..) header to the .csv file. Also "file test.csv" doesn't give me a "comma separated". This means even tho everything is separated by commas Excel won't notice it using double click.
2. sed/awk: Found some code snippets but all of them were separating the table very badly. E.g. colum street "streetname number" remained a 'streetname','number' which made 2 colums out of one and the table was screwed.

So it seems to me that Excel can only open .csv with a double click which
a) Are encoded with ISO-8859-1 (and only under windows because standard mac charset is Macintosh)
b) File having the attribute "comma separated". This means if I create a .csv through Excel itself the output of

file test1.csv

would be

test1.csv: ISO-8859 text, with CRLF line terminators

while a iconv changed charset with RegEx used for adding commas would look like:

test1.csv: ISO-8859 text

Pretty weird behaviour - maybe someone got a working solution.

Answer Source

Thanks everyone for the help, I finally managed to get a working - double clickable csv file which opens separated and displaying the letter correctly. For those who are interested in a good workflow here we go:

1.) My database is completely using UTF8. 2.) I export a form into my database via php. I'm using mysqli and as header information:

header("Content-Type: text/html; charset=ISO-8859");

I know this makes everything look crappy inside the database, feel free to use utf8 to make it look correctly but it doesn't matter in my case.

3.) I wrote a script executed by a cron daemon which a) removes the .csv files which were created previously

rm -f path/to/csv ##I have 3 due to some renaming see below

b) creating the new csv using mysql (this is still UTF8)

mysql -hSERVERIP -uUSER -pPASS DBNAME -e "select * from DBTABLE;" > PATH/TO/output.csv

Now you have a tab separated .csv and (if u exported from PHP in UTF8) it will display correctly in OpenOffice etc. but not in Excel. Even an import as UTF8 isn't working.

c) Making the file SEMICOLON separated (Excel standard, double clicking a comma separated file won't work at least not with the european version of Excel). I used a small python script

import sys
import csv

tabin = csv.reader(sys.stdin, dialect=csv.excel_tab)
commaout = csv.writer(sys.stdout, delimiter=";")
for row in tabin:

d) Now I had to call the script inside my cron sh file:

/usr/bin/python PATH/TO/ < output.csv > output_semi.csv

Make sure you use the full path for every file if u use the script as cron.

e) Change the charset from UTF8 to ISO-8859-1 (Windows ANSI Excel standard) with iconv:

iconv -f utf8 -t ISO-8859-1 output_semi.csv > output_final.csv

And that's it. csv opens up on double click on Mac/Windows Excel 2010 (tested).

Maybe this is a help for someone with similar problems. It drove me crazy.

Edit: For some servers you don't need iconv because the output from the database is already ISO8859. You should check your csv after executing the mysql command:

file output.csv

Use iconv only if the charset isn't iso8859-1

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download