manu sharma - 1 year ago
Python Question

Python: parse all files in a folder

I am trying to parse all the files in a folder with a Python loop and store the results in a dataframe. I am using the following script:


for filename in os.listdir(path):
    tree = ET.parse(filename)
    a = ET.tostring(tree.getroot(), encoding='utf-8', method='text')
    c = a.replace('\n', '')
    df = df.append({'text': c, 'type': 'abc'}, ignore_index=True)

and the folder at path contains the following files:


Every time I run my code, it shows me an error:

IOError: [Errno 2] No such file or directory: 'abc1'

although the file is there. Where am I making a mistake? I appreciate any help.

Answer Source

os.listdir() returns only filenames (not full paths), so ET.parse(filename) looks for each file in the current working directory rather than in path.
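To make this concrete, here is a minimal sketch of keeping os.listdir() and joining each name back onto the directory with os.path.join. The helper name list_full_paths and the throwaway temporary directory are illustrative assumptions, not part of the original question.

```python
import os
import tempfile


def list_full_paths(path):
    """Return the entries of `path` with the directory joined back on."""
    return sorted(os.path.join(path, name) for name in os.listdir(path))


# Demo in a throwaway directory (hypothetical file name taken from the error message).
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "abc1"), "w").close()
    names = os.listdir(tmp)        # bare names, relative to tmp
    paths = list_full_paths(tmp)   # names with the directory prefix restored
```

Passing the joined paths to ET.parse lets the loop work regardless of the current working directory.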

Try glob.glob(path + '/*.xml') instead of os.listdir(path). It returns each match with the directory prefix included and skips files that do not end in .xml:


In [111]: path = 'd:/temp/xml'

In [112]: os.listdir(path)
Out[112]: ['1.xml', '2.xml', '3.xml', 'bla.tmp']

In [113]: glob.glob(path + '/*.xml')
Out[113]: ['d:/temp/xml\\1.xml', 'd:/temp/xml\\2.xml', 'd:/temp/xml\\3.xml']
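Putting this together, the loop from the question might be rewritten roughly as follows. This is a sketch under assumptions: the function name collect_xml_text and the pattern and label parameters are hypothetical, it targets Python 3 (hence encoding="unicode" so that tostring returns str rather than bytes), and it collects plain dicts instead of appending to a dataframe row by row. Since the error message mentions a file called 'abc1' with no extension, the pattern may need to be '*' rather than '*.xml' for the asker's folder.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET


def collect_xml_text(path, pattern="*.xml", label="abc"):
    """Parse every file in `path` matching `pattern`; return one row dict per file."""
    rows = []
    for filename in sorted(glob.glob(os.path.join(path, pattern))):
        root = ET.parse(filename).getroot()
        # method="text" keeps only the text content; encoding="unicode" yields str.
        text = ET.tostring(root, encoding="unicode", method="text")
        rows.append({"text": text.replace("\n", ""), "type": label})
    return rows


# Demo in a throwaway directory with one XML file and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "1.xml"), "w") as f:
        f.write("<a>hello\n<b>world</b></a>")
    with open(os.path.join(tmp, "bla.tmp"), "w") as f:
        f.write("not xml")
    rows = collect_xml_text(tmp)
```

The resulting list can be handed to pandas.DataFrame(rows) to build the dataframe in one step, which also avoids DataFrame.append (removed in pandas 2.0).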