whoopididoo - 1 year ago
Python Question

Read an Excel file and create a text file from it, keeping Polish characters

I need to read in an Excel file that contains lots of Polish characters, then write its contents to a text file while keeping those characters.
So far I can open the file and write it, but every time the output contains the unicode escape values instead of the characters. As you can see from my code, I strip out the u' before I write the file, but the escaped values are still there.

I end up with something like this when I open the text file:

[29178.0, Firma handlowa', Sklep farbiarsko-chemiczny', A-ZET ZHU
CIEBIELSKI ZENO', LWOWEK', 7880005802.0, CW PS', \u0141uczak
Rafa\u0142', ciebielski1@wp.pl', Nie', ', 17242.364799999999, 1061.48,
0.061562321196220141, Nie', 0.0, 1.0]

But I want it to look like this...

29,178 Firma handlowa Sklep farbiarsko-chemiczny A-ZET ZHU CIEBIELSKI
ZENO LWOWEK 7880005802 CW PS Łuczak Rafał ciebielski1@wp.pl Nie

import xlrd

wb = xlrd.open_workbook(xl_workbook+'.xls')

# Get the sheet names
sheets = wb.sheet_names()

for sheet in sheets:
    # Open the first sheet
    sh = wb.sheet_by_name(sheets[0])

    with open(xl_workbook+'.txt', "wb") as f:
        for rownum in xrange(sh.nrows):
            # str() of a list uses repr() on each item, hence the u'...' and \uXXXX output
            stri_line = str([val for val in sh.row_values(rownum)])
            stri = str(stri_line.replace("u'", ""))
            f.write(stri)


There must be a way to do this. Any help is very much appreciated.

Answer Source

Replace the last part of your code with this:

with open(xl_workbook+'.csv.txt', "wb") as f:
    sep=u"\t" #tab character
    for rownum in xrange(sh.nrows):
        stri_line=sep.join(unicode(val) for val in sh.row_values(rownum))
        stri=stri_line.encode("utf8") # or any encoding you want that supports Polish characters
        f.write(stri) #the encoded row
        f.write("\n") #newline
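
The escapes in the question's output come from calling str() on the whole list, which writes the repr of each item. Joining the values as text and encoding once avoids that. For anyone on Python 3, here is a rough sketch of the same idea. It assumes each row is a plain list like the ones sh.row_values(rownum) returns. format_cell and write_rows are made-up helper names, not part of xlrd, and dropping the trailing .0 from whole-number floats is my guess at what the desired output wants (it does not add thousands separators).

```python
import io


def format_cell(val):
    """Render one cell as text; whole-number floats lose the trailing .0."""
    # Assumption: numeric cells arrive as floats, as in the question's output.
    if isinstance(val, float) and val.is_integer():
        return str(int(val))
    return str(val)


def write_rows(rows, path, sep="\t", encoding="utf-8"):
    """Write each row as one line of sep-joined text in the given encoding."""
    # Text mode with an explicit encoding keeps characters such as Ł and ł intact.
    with io.open(path, "w", encoding=encoding, newline="\n") as f:
        for row in rows:
            f.write(sep.join(format_cell(val) for val in row) + "\n")
```

With a sheet opened as in the question, something like write_rows((sh.row_values(r) for r in range(sh.nrows)), xl_workbook + '.csv.txt') should produce one tab-separated line per row.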