384X21 - 4 months ago
Python Question

How to read a large file line by line in Python

I want to iterate over each line of an entire file. One way to do this is to read the entire file into a list and then go over the lines of interest. This uses a lot of memory, so I am looking for an alternative.

My code so far:

import fileinput

for each_line in fileinput.input(input_file):
    do_something(each_line)

    for each_line_again in fileinput.input(input_file):
        do_something(each_line_again)


Executing this code gives an error message: device active.

Any suggestions?

EDIT: The purpose is to calculate pair-wise string similarity, meaning that for each line in the file, I want to calculate the Levenshtein distance to every other line.

Answer

Nobody has given the correct, fully Pythonic way to read a file. It's the following:

with open(...) as f:
    for line in f:
        <do something with line>

The with statement ensures the file is closed when the block exits, even if an exception is raised inside it. Iterating with for line in f treats the file object f as an iterable that reads through a buffer and yields one line at a time, so the whole file is never loaded into memory and you don't have to worry about large files.
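For something you can run directly, here is a minimal sketch of the pattern above. The helper name longest_line_length and the temporary demo file are made up for illustration; swap in your own path and per-line work.

```python
import os
import tempfile


def longest_line_length(path):
    """Stream the file and return the length of its longest line.

    Only the current line is held in memory, so this also works for
    files that are far larger than available RAM.
    """
    longest = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Drop the trailing newline before measuring.
            longest = max(longest, len(line.rstrip("\n")))
    return longest


# Hypothetical demo file, created only so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha\nbeta\ngamma delta\n")
    demo_path = tmp.name

result = longest_line_length(demo_path)
os.remove(demo_path)
```

The file is closed as soon as the with block ends, whether the loop finishes normally or raises.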

There should be one -- and preferably only one -- obvious way to do it.
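For the pair-wise comparison mentioned in the question's EDIT, one possible approach (a sketch, not the only way) is to nest two independent open() calls, so that each loop has its own file object. The levenshtein function below is a plain dynamic-programming implementation written for this example, and pairwise_distances and the demo file are hypothetical names; a dedicated library would likely be faster on real data.

```python
import os
import tempfile


def levenshtein(a, b):
    """Edit distance between two strings, keeping only two rows of the table."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def pairwise_distances(path):
    """Yield (i, j, distance) for every pair of lines with i < j.

    The inner loop reopens the file for each outer line, so only two
    lines are in memory at once, at the price of rereading the file.
    """
    with open(path, encoding="utf-8") as outer:
        for i, line_a in enumerate(outer):
            with open(path, encoding="utf-8") as inner:
                for j, line_b in enumerate(inner):
                    if j > i:
                        yield i, j, levenshtein(
                            line_a.rstrip("\n"), line_b.rstrip("\n")
                        )


# Hypothetical demo file, created only so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("kitten\nsitting\nmitten\n")
    demo_path = tmp.name

results = list(pairwise_distances(demo_path))
os.remove(demo_path)
```

This trades speed for memory: the file is read once per line, so the total work grows quadratically with the number of lines. If the file does fit in memory after all, reading it into a list once and looping over index pairs avoids the repeated disk reads.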