Carlton Banks - 1 month ago
Python Question

Store a string as a list of numbers

I am looking for a method in Python that can convert a string (of varying length) that looks something like this. The string contains data points extracted from a special file, one of 100 different files.

[ 52.61236 -3.144785 -11.27863 -7.346569 11.27105 -2.408065 -10.35697 -15.61926 -2.353437 3.109831 -9.151857 -18.2364 -10.63264
56.55116 -9.186506 -13.75657 5.94078 5.96905 -1.483013 -4.259864 -12.57798 -0.9668677 -10.42454 13.35543 -9.19768 -14.42702
56.55116 -12.68435 -22.5432 12.2574 -1.100278 1.703274 6.538071 -3.648291 -1.31351 -6.617892 7.883823 1.777809 -20.30247
58.31721 -15.8642 -29.1799 -2.507436 9.061881 -8.988363 -2.703156 -9.803705 3.01952 -5.810421 -11.41331 6.092502 -14.42702
54.18788 -21.26995 -13.06826 -1.524487 7.294549 -6.622187 -6.594927 -7.723001 -0.4469042 -10.07848 1.881792 4.235661 2.776365
51.8246 -24.13182 1.875062 -0.08787012 -0.6584454 -6.10827 -5.686847 -17.57512 -2.70008 -7.425363 3.382299 2.605522 1.001098
47.8858 -16.81816 -5.772095 -7.346569 -0.2166124 -9.863102 -7.11383 -10.75736 -2.006795 -5.233656 2.45891 0.7017822 0.5999346 ]


to a list of numbers. I am doing this in Python, so I guess there must be a handy library somewhere that I can use. But which one?

Answer

It is important to know where this rather strange kind of string is coming from, but if this is what you have and you cannot change it, here are several different approaches.

Option 1

Pre-process the string by replacing the spaces between the numbers with commas and use ast.literal_eval() to load into a Python list:

In [1]: s = """ [ 52.61236 -3.144785 -11.27863 -7.346569 11.27105 -2.408065 -10.35697 -15.61926 -2.353437 3.109831 -9.151857 -18.2364 -10.63264
   ...:   56.55116 -9.186506 -13.75657 5.94078 5.96905 -1.483013 -4.259864 -12.57798 -0.9668677 -10.42454 13.35543 -9.19768 -14.42702
   ...:   56.55116 -12.68435 -22.5432 12.2574 -1.100278 1.703274 6.538071 -3.648291 -1.31351 -6.617892 7.883823 1.777809 -20.30247
   ...:   58.31721 -15.8642 -29.1799 -2.507436 9.061881 -8.988363 -2.703156 -9.803705 3.01952 -5.810421 -11.41331 6.092502 -14.42702
   ...:   54.18788 -21.26995 -13.06826 -1.524487 7.294549 -6.622187 -6.594927 -7.723001 -0.4469042 -10.07848 1.881792 4.235661 2.776365
   ...:   51.8246 -24.13182 1.875062 -0.08787012 -0.6584454 -6.10827 -5.686847 -17.57512 -2.70008 -7.425363 3.382299 2.605522 1.001098
   ...:   47.8858 -16.81816 -5.772095 -7.346569 -0.2166124 -9.863102 -7.11383 -10.75736 -2.006795 -5.233656 2.45891 0.7017822 0.5999346 ]"""

In [2]: from ast import literal_eval

In [3]: import re

In [4]: l = literal_eval(re.sub(r"(?<=\d)\s+(?=[\-\d])", ",", s.strip()))

In [5]: l
Out[5]: 
[52.61236,
 -3.144785,
 -11.27863,
 ...
 2.45891,
 0.7017822,
 0.5999346]
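
To make the intermediate step of Option 1 visible, here is a minimal sketch on a shortened sample (only the first three values of the first two rows from the question). The variable names `sample` and `as_literal` are illustrative and not part of the original answer.

```python
import re
from ast import literal_eval

# Shortened sample: first three values of the first two rows from the question.
sample = "[ 52.61236 -3.144785 -11.27863\n  56.55116 -9.186506 -13.75657 ]"

# Whitespace (spaces or newlines) that follows a digit and precedes a digit
# or a minus sign is replaced by a comma, giving a valid Python list literal.
as_literal = re.sub(r"(?<=\d)\s+(?=[\-\d])", ",", sample.strip())
print(as_literal)
# [ 52.61236,-3.144785,-11.27863,56.55116,-9.186506,-13.75657 ]

# literal_eval only accepts Python literals, so it is safe on untrusted text.
numbers = literal_eval(as_literal)
print(numbers)
# [52.61236, -3.144785, -11.27863, 56.55116, -9.186506, -13.75657]
```

Note that the substitution flattens the rows: the newline between the two rows becomes a comma like any other gap.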

Option 2 (probably the easiest)

strip() the square brackets, split() on whitespace (called with no arguments, it splits on newlines as well as spaces) and use float() to convert the individual substrings to floats:

In [13]: [float(item) for item in s.strip(" []").split()]
Out[13]: 
[52.61236,
 -3.144785,
 -11.27863,
 ...
 2.45891,
 0.7017822,
 0.5999346]
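
Since the strings come from many files, it may be convenient to wrap Option 2 in a function. The sketch below does that, and also shows a row-preserving variant under the assumption that each line of the string is one record (the question does not say so). The names `parse_flat` and `parse_rows` are hypothetical helpers.

```python
def parse_flat(text):
    """Return every number in the bracketed string as one flat list of floats."""
    # split() with no arguments splits on any run of whitespace, newlines included.
    return [float(item) for item in text.strip(" []\n").split()]


def parse_rows(text):
    """Return one list of floats per non-empty line.

    Assumption: each line of the input is one record.
    """
    body = text.strip(" []\n")
    return [
        [float(item) for item in line.split()]
        for line in body.splitlines()
        if line.strip()
    ]


# Shortened sample: first three values of the first two rows from the question.
sample = "[ 52.61236 -3.144785 -11.27863\n  56.55116 -9.186506 -13.75657 ]"

print(parse_flat(sample))
# [52.61236, -3.144785, -11.27863, 56.55116, -9.186506, -13.75657]
print(parse_rows(sample))
# [[52.61236, -3.144785, -11.27863], [56.55116, -9.186506, -13.75657]]
```

Both helpers raise ValueError if a token is not a valid float, which is usually preferable to silently skipping bad data.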

Option 3

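Use re.findall() to find all substrings made of dashes, digits and dots, then map the float function over them. In Python 2, map() returns a list directly; in Python 3 it returns an iterator, so wrap it in list(), which works in both:

In [14]: list(map(float, re.findall(r"[\d\-.]+", s)))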
Out[14]: 
[52.61236,
 -3.144785,
 -11.27863,
 ...
 2.45891,
 0.7017822,
 0.5999346]
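
One caveat with the character-class pattern in Option 3: it has no notion of exponents, so a value written in scientific notation would be split into pieces. The data in the question contains no such values, so this is only a precaution. The sketch below uses a stricter pattern that is my own assumption, not part of the original answer; `NUMBER`, `find_floats` and the `hypothetical` string are illustrative names and data.

```python
import re

# Stricter pattern (an assumption, not from the answer): optional sign,
# digits with an optional fractional part (or a bare fraction), optional exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def find_floats(text):
    """Return all numbers found in text as floats."""
    return [float(match) for match in NUMBER.findall(text)]


# First three values from the question's string.
sample = "[ 52.61236 -3.144785 -11.27863 ]"
print(find_floats(sample))
# [52.61236, -3.144785, -11.27863]

# Hypothetical input with exponents, NOT taken from the question.
hypothetical = "[ 1.5e-05 -2E3 ]"
print(find_floats(hypothetical))
# [1.5e-05, -2000.0]

# The simpler pattern splits the same input at the exponent markers.
print([float(m) for m in re.findall(r"[\d\-.]+", hypothetical)])
# [1.5, -5.0, -2.0, 3.0]
```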