I am writing some scripts to process some text files in python. Locally the script reads from a single txt file thus i use
index_file = open('index.txt', 'r')
for line in index_file:
import os from glob import glob def readindex(path): pattern = '*.txt' full_path = os.path.join(path, pattern) for fname in sorted(glob(full_path)): for line in open(fname, 'r'): yield line # read lines to memory list for using multiple times linelist = list(readindex("directory")) for line in linelist: print line,
This script defines a generator (see this question for details about generators) to iterate through all the files in directory "directory" that have extension "txt" in sorted order. It yields all the lines as one stream that after calling the function can be iterated through as if the lines were coming from one open file, as that seems to be what the question author wanted. The comma at the end of print line, makes sure that newline is not printed twice, although the content of the for loop would be replaced by question author anyway. In that case one can use line.rstrip() to get rid of the newline.
The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order.