I am working in Python and I have a matrix stored in a text file. The text file is arranged in the following format:
There is a fast and memory-efficient way of handling such matrices: using the sparse matrices offered by SciPy (the de facto standard in Python for this kind of thing).
For a matrix of size N × N, you can for instance do:

    from scipy.sparse import lil_matrix

    result = lil_matrix((N, N))
    # In order to save memory, one may add: dtype=bool, or dtype=numpy.int8

    with open('matrix.csv') as input_file:
        for line in input_file:
            x, y = map(int, line.split(',', 1))  # The "1" is only here to speed the splitting up
            result[x, y] = 1
(or, in one line instead of two: result[tuple(map(int, line.split(',', 1)))] = 1 — the tuple() call is needed in Python 3, where map() returns an iterator rather than a list).
The 1 given to split() is just there to speed things up when parsing the coordinates: it instructs Python to stop splitting the line when the first (and only) comma is found. This can matter somewhat, since you are reading a 1 GB file.
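To make the maxsplit argument concrete, here is a quick illustration (the sample line is made up; int() tolerates the trailing newline, so no explicit strip() is needed):

```python
# str.split(',', 1) stops after the first comma.
line = "12,34\n"
parts = line.split(',', 1)
print(parts)        # the second element keeps the trailing newline

# int() ignores surrounding whitespace, so the newline is harmless:
x, y = map(int, parts)
print(x, y)
```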
Depending on your needs, you might find one of the other six sparse matrix representations offered by SciPy to be better suited.
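For example, LIL is good for incremental construction, while CSR is better for arithmetic and row slicing. A minimal sketch of the usual build-then-convert pattern (the 4 × 4 matrix and its entries are made up for illustration):

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build incrementally in LIL format (efficient element-wise writes)...
m = lil_matrix((4, 4), dtype=np.int8)
m[0, 1] = 1
m[2, 3] = 1

# ...then convert to CSR for fast arithmetic and matrix-vector products.
csr = m.tocsr()
print(csr.nnz)            # number of stored (non-zero) entries
v = csr.dot(np.ones(4))   # row sums, computed without densifying
print(v)
```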
If you want a faster but more memory-consuming representation, you can use a dense NumPy array instead: result = numpy.array(…).
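As a sketch of the trade-off (N = 4 is a placeholder; in the real use case the dense array costs N*N bytes with int8, however sparse the data is):

```python
import numpy as np

N = 4  # placeholder for the real matrix dimension
result = np.zeros((N, N), dtype=np.int8)  # dense: N*N bytes up front
result[1, 2] = 1                          # element assignment is very fast

print(result.nbytes)  # memory footprint in bytes
```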