Konrad Gje - 1 year ago
Python Question

Splitting a string list to floats - memory error

I can't manage to split a string of numbers into a list of floats. I open a text file (.vol, but it just contains text) whose last line is an extremely long line of numbers.

Params1 437288.59375000 102574.20312500 -83.30001831

Params2 437871.93750000 104981.40625000 362.10000610

Params3 0.00000000

Params4 342 1416 262

Params5 583.34375000 2407.19995117 445.40002441

Params6 20.00000000

Params7 1.70000005

Params8 126879264


0.25564435 0.462439 0.1365 0.1367 26.00000000 (etc., there are millions of values)

Since this is the 10th line of the text file, I load the lines into a list with:

with open('d:/py/LAS21_test.vol') as f:
    txt = []
    for line in f:
        txt.append(line)

Then I try to convert that line from a string to floats with:

A = []
for vx in txt[9]:
    try:
        A.append(float(vx))
    except ValueError:
        pass
print(A[0:20])
print(txt[9][0:20])

This gives me these results:

[0.0, 2.0, 5.0, 5.0, 6.0, 4.0, 4.0, 3.0, 5.0, 0.0, 4.0, 6.0, 2.0, 4.0, 3.0, 9.0, 0.0, 1.0, 3.0, 6.0]
0.25564435 0.462439

What I would like to have is a list of correctly split floats, like:

[0.25564435, 0.462439]

I used except ValueError to skip the whitespace: when I call float() on each element without it, I get a ValueError.
Second issue: I can't split the whole line with split(), because then I get a MemoryError.

How can I convert this to a list of floats properly?

Answer

You have two separate problems here (as mentioned in the comments):

1) In the first case you are iterating over the string txt[9] character by character instead of splitting it first, so each digit is converted on its own, and the spaces and decimal points raise the ValueError that you are silencing.

2) In the second case there are probably too many numbers on the line: splitting it creates a huge list of strings in addition to the line itself, which fills up your memory (hence the MemoryError).
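The first point can be reproduced on a short sample of the line. This is only an illustrative sketch: the sample string is taken from the question, and per_character and per_token are hypothetical helper names, not part of the original code.

```python
# Short sample of the long line from the question (assumed to end with a newline).
line = "0.25564435 0.462439\n"


def per_character(text):
    """Mimic the question's loop: float() is applied to every single character."""
    out = []
    for ch in text:
        try:
            out.append(float(ch))
        except ValueError:
            # '.', ' ' and '\n' are not valid floats on their own and are skipped
            pass
    return out


def per_token(text):
    """Split on whitespace first, then convert each whole token."""
    return [float(tok) for tok in text.split()]


print(per_character(line)[:5])  # single digits, as in the question's output
print(per_token(line))          # the two intended values
```

Iterating over a str always yields one-character strings, which is why the question's output consists of single digits.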

Try this first:

numbers = [float(s) for s in txt[9].split()]

This should give you a list of numbers. If this also causes a MemoryError, you will have to iterate over the string by hand:

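numbers = []
numstring = []
for s in txt[9]:
    # Reached whitespace: the number collected so far is complete
    if s.isspace():
        if numstring:
            numbers.append(float(''.join(numstring)))
        numstring = []
    else:
        numstring.append(s)
if numstring:  # last number, if the line has no trailing whitespace
    numbers.append(float(''.join(numstring)))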

This will be much slower, but it avoids building the intermediate list of strings and so uses less memory.
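If a plain list of floats is still too large, one alternative that is not part of the answer above is to scan the line lazily with re.finditer and store the values in an array('d') from the standard library, which holds raw 8-byte doubles instead of separate Python float objects. In the header, Params8 equals 342 × 1416 × 262 = 126879264; if that is the number of values (an assumption), the array alone would need about 1 GB. The sketch below assumes whitespace-separated values, and parse_floats is a hypothetical helper name.

```python
import re
from array import array


def parse_floats(line):
    """Parse whitespace-separated numbers into a compact array of doubles.

    re.finditer yields one match at a time, so no intermediate list of
    token strings is built, unlike line.split().
    """
    values = array('d')  # 'd' = C double, 8 bytes per item
    for match in re.finditer(r'\S+', line):
        values.append(float(match.group()))
    return values


# Small stand-in for txt[9] from the question
sample = "0.25564435 0.462439 0.1365 0.1367 26.00000000\n"
numbers = parse_floats(sample)
print(len(numbers), numbers[0], numbers[-1])
```

The result supports indexing, slicing and len() like a list, so numbers[0:20] works the same way. The line itself still has to fit in memory with this approach.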