Shaun Aran - 1 month ago
Python Question

Python - process a chunk of lines in a file

I have a file containing x values, each on its own line. I need to be able to take n values from this file, put them into an array, pass that array into a new process, clear the array, and then take another n values from the file to give to the next process.

The problem I'm having is when x is a value like 12 and I'm trying to give each process a chunk of, let's say, 10 values.

The first process will get its first 10 values with no problem, but I'm having trouble giving the remaining 2 to the last process.

The problem would also arise if, let's say, you tell the program to give each process 10 values from the file, but the file only has 1, or even 9 values.

I need to know when I'm at the last set of values, which contains fewer than n values.

I want to avoid reading every value in the file into an array all at once, since I could run into memory problems if there were millions of values in that file.

Here's an example of what I've tried to do:

chunk = 10
value_list = []
with open('file.txt', 'r') as f:
    for value in f:
        value_list.append(value)
        if len(value_list) >= chunk:
            print('Got %d' % len(value_list))
            value_list = []  # Clear the list
            # Put array into new process


This will catch every full group of 10 in this example, but it won't handle a final group of fewer than 10 values, or a file that has fewer than 10 values to begin with.

Answer

What I typically do in this situation is just handle the last (short) array after the for loop. For example,

chunk = 10
value_list = []
with open('file.txt', 'r') as f:
    for value in f:
        # Append first, so the value read on this iteration is never dropped
        value_list.append(value)
        if len(value_list) >= chunk:
            print('Got %d' % len(value_list))
            # Put array into new process
            value_list = []  # Start a fresh list for the next chunk
    # Send leftovers (fewer than chunk values) to a new process
    if value_list:
        print('Got %d' % len(value_list))
        # Put final array into new process
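
If you'd rather not repeat the hand-off code after the loop, one alternative is to wrap the chunking in a generator built on itertools.islice. This is only a sketch, not the approach above: read_chunks is a made-up helper name, stripping the trailing newline is my own assumption about what you want, and the in-memory list in the demo just stands in for a 12-line file.

```python
from itertools import islice


def read_chunks(lines, chunk_size):
    """Yield lists of up to chunk_size lines (newline stripped) from an iterable.

    Hypothetical helper for illustration; `lines` can be an open file object.
    """
    iterator = iter(lines)
    while True:
        # islice takes at most chunk_size items and stops early at end of input,
        # so the final chunk is simply shorter than the others
        chunk = [line.rstrip('\n') for line in islice(iterator, chunk_size)]
        if not chunk:
            return  # input exhausted
        yield chunk


# Demo: a list of 12 lines standing in for the file
for chunk in read_chunks(['%d\n' % i for i in range(12)], 10):
    print('Got %d' % len(chunk))
    # Put chunk into new process
```

With a real file you would pass the open file object instead of the list, for example read_chunks(f, 10) inside the with block. Only one chunk is held in memory at a time, and the short last chunk (or a file with fewer than n values) comes out of the same loop as the full ones.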