user1620012 - 1 year ago 58
Python Question

Reading files in a particular order in python

Let's say I have three files in a folder: file9.txt, file10.txt and file11.txt, and I want to read them in this particular order. Can anyone help me with this?

Right now I am using the code

import glob, os
for infile in glob.glob(os.path.join( '*.txt')):
    print "Current File Being Processed is: " + infile

and it reads first file10.txt then file11.txt and then file9.txt.

How can I get the files in the right order?

Answer Source

Files on the filesystem are not sorted. You can sort the resulting filenames yourself using the sorted() function:

for infile in sorted(glob.glob('*.txt')):
    print "Current File Being Processed is: " + infile

Note that the os.path.join call in your code is a no-op; with only one argument it doesn't do anything but return that argument unaltered.
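To see the no-op concretely, here is a minimal sketch. The folder name 'data' is hypothetical and is only there to show what os.path.join does once it has more than one argument.

```python
import os

# With a single argument, os.path.join returns that argument unchanged.
single = os.path.join('*.txt')

# With a directory in front (hypothetical name 'data'), it builds a
# pattern that points into that folder, using the platform's separator.
in_folder = os.path.join('data', '*.txt')

print(single)
print(in_folder)
```

So the call only earns its place if you want to glob in a folder other than the current one.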

Note that your filenames will sort lexicographically, character by character, which puts 10 before 9 because '1' sorts before '9'. You can use a custom key function to sort the numbers by value instead:

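import glob
import re

numbers = re.compile(r'(\d+)')

def numericalSort(value):
    # Split into alternating text and digit runs; the capturing group keeps the digits.
    parts = numbers.split(value)
    # Convert the digit runs (odd indexes) to integers so they compare numerically.
    parts[1::2] = map(int, parts[1::2])
    return parts

for infile in sorted(glob.glob('*.txt'), key=numericalSort):
    print "Current File Being Processed is: " + infile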

The numericalSort function splits a filename into runs of digits and non-digits, turns each run of digits into an actual number, and returns the resulting list as the sort key:

>>> files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
>>> sorted(files)
['32foo10.txt', '32foo9.txt', 'file10.txt', 'file11.txt', 'file9.txt']
>>> sorted(files, key=numericalSort)
['32foo9.txt', '32foo10.txt', 'file9.txt', 'file10.txt', 'file11.txt']
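It may help to look at the keys themselves. The sketch below is a variant of the function above under an illustrative name, numerical_key, and it uses a list comprehension instead of map; the behaviour is meant to be the same.

```python
import re

numbers = re.compile(r'(\d+)')

def numerical_key(value):
    # Splitting on a capturing group keeps the digit runs, at the odd indexes.
    parts = numbers.split(value)
    # Turn those digit runs into integers; the text pieces stay strings.
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

key_9 = numerical_key('file9.txt')
key_10 = numerical_key('file10.txt')
key_32 = numerical_key('32foo9.txt')

print(key_9)   # ['file', 9, '.txt']
print(key_10)  # ['file', 10, '.txt']
print(key_32)  # ['', 32, 'foo', 9, '.txt']
```

Because the split always alternates text and digits, strings are only ever compared with strings and integers with integers, so 9 sorts before 10. A name that starts with digits gets an empty string as its first element, which keeps the alternation intact.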