Tang Tang - 3 months ago
Python Question

How to read files with messy, unreadable names?

I have a lot of data files with unreadable names:

[screenshot of the garbled file names]

Within Python, I can use glob.glob to find them.
But when I try to read a file with pandas, an error occurs.
Here is my code:

import pandas as pd
import os
import glob
cwd=os.getcwd()
os.chdir(cwd)
for file in glob.glob("S*.xls"):
    temp=pd.read_excel(file)


Here is the error message:

IOError: [Errno 22] invalid mode ('rb') or filename: 'Shibor\xa8\xbay?Y2006.xls'


May I ask how I can read files with names like "ShiborÊý¾Ý2015.xls"?

Answer

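Use unicode file names/paths: add a "u" prefix to the glob pattern. In Python 2, a unicode pattern makes glob.glob return unicode file names, so the non-ASCII characters are not mangled into "?" before the file is opened. Like this: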

for file in glob.glob(u"S*.xls"):
    temp=pd.read_excel(file)
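
The same idea can be wrapped in a small helper. This is only a sketch, assuming Python 2 on Windows as in the question; on Python 3 every str is already unicode, so the "u" prefix is harmless but unnecessary. The name find_xls_files and its parameters are made up for illustration and are not part of glob or pandas.

```python
import glob
import os
import sys


def find_xls_files(directory, pattern=u"S*.xls"):
    """Return the paths matching `pattern` inside `directory` as unicode strings.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # On Python 2, os.getcwd() and similar calls return byte strings.
    # Decode them first so that joining with the unicode pattern cannot
    # fail on a non-ASCII directory name.
    if isinstance(directory, bytes):
        directory = directory.decode(sys.getfilesystemencoding() or "utf-8")
    # A unicode pattern makes glob return unicode paths on Python 2.
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

Each returned path can then be passed to pd.read_excel exactly as in the loop above, for example: for path in find_xls_files(os.getcwd()): temp = pd.read_excel(path).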