algoProg - 1 month ago
Python Question

Python same script opens files in order in one dir and not in another

For file screening, I have the following code in two different directories:

import os, re

g=open('results_1.txt', 'w') #Other has 'results_2.txt'

for filename in os.listdir('.'):
    if filename.startswith("f"):
        with open(filename, 'r') as f:
            content =[line.rstrip() for line in f]

        A=filter(lambda x: 'KeyWord_1 :' in x, content)
        B=filter(lambda x: 'KeyWord_2 :' in x, content)

        print >> g,filename,

        for item in A:
            print >> g,item,
        for item in B:
            print >> g,item,

g.close()


Both directories use a similar naming convention for the files to be parsed by my script. The files look like this: "file_1000.txt", "file_100.txt", "file_101.txt",.....,"file_1.txt",......"file_9.txt".

The only difference between the two copies of the script is the name of the results file. In one directory the files are processed in order from _1 to _1000, so the results file has the appropriate order, while in the other directory they are not. Why?

I am sorry, this is related to my work and I can't give any specifics. Thank you.

P.S. I tried the sorted function and it did not work as I wanted.

Answer

From the documentation on os.listdir:

Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries '.' and '..' even if they are present in the directory.

You need to sort the result using a preferred sort order. You vaguely point out that the resulting order wasn't as expected when you tried sorting it, which I take to mean that you probably do not want a lexicographical sort, but a numeric sort on the trailing numbers in the filename:

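import os

def trailing_number(filename):
    # 'file_100.txt' -> 'file_100' -> 100 (rstrip('.txt') strips characters, not a suffix)
    stem = os.path.splitext(filename)[0]
    return int(stem.split('_')[1])

sorted(os.listdir('.'), key=trailing_number)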

Adapt the above to handle the real format of your filenames. Also don't forget to handle exceptions in trailing_number which can arise if some of your filenames don't conform to the same format.
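
As a rough illustration of both points (lexicographic versus numeric order, and tolerating filenames that do not fit the pattern), here is a minimal sketch. It assumes the names look like "file_100.txt" as described in the question. The regular expression, the helper numerically_sorted, and the sample list are made up for this example, so adjust them to your real filenames.

```python
import re

# Assumed pattern: starts with "f", ends with "_<digits>.txt" (e.g. "file_100.txt").
NAME_PATTERN = re.compile(r'^f.*_(\d+)\.txt$')


def trailing_number(filename):
    """Return the trailing number as an int, or None if the name does not match."""
    match = NAME_PATTERN.match(filename)
    return int(match.group(1)) if match else None


def numerically_sorted(filenames):
    """Drop non-conforming names, then sort the rest by trailing number."""
    conforming = [name for name in filenames if trailing_number(name) is not None]
    return sorted(conforming, key=trailing_number)


# Hypothetical sample names in the style given in the question.
names = ['file_1000.txt', 'file_100.txt', 'file_9.txt', 'file_1.txt',
         'results_1.txt', 'notes.txt']

# Plain sorted() compares strings character by character, so "file_9.txt"
# lands after "file_1000.txt".
lexicographic = sorted(names)
# ['file_1.txt', 'file_100.txt', 'file_1000.txt', 'file_9.txt',
#  'notes.txt', 'results_1.txt']

# Sorting on the parsed number gives the order the question asks for.
numeric = numerically_sorted(names)
# ['file_1.txt', 'file_9.txt', 'file_100.txt', 'file_1000.txt']
```

In the script from the question, the loop would then iterate over numerically_sorted(os.listdir('.')) instead of os.listdir('.'). Returning None for names that do not match, rather than raising, is just one possible design choice. It also replaces the startswith("f") check, since such names are filtered out before sorting.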