bublitz bublitz - 12 days ago 6
Python Question

Python: Parse string to array

I am having trouble parsing a string into a numpy array.

The string looks like this:

input = '{{13,1},{2,1},{4,4},{1,7},{9,1}}'


The string represents a sparse vector. The vector as a whole is delimited by curly brackets, and so is each entry. The first entry in the list encodes the dimensions of the vector; every following entry is an {index, value} pair giving a zero-based index and the value stored at that index.

In the above example, the vector has a length of 13 and 4 entries which are different from 0:

output = np.array([0,7,1,0,4,0,0,0,0,1,0,0,0])


After parsing it to an array, I have to convert it back to a string in its dense form, with the format:

stringoutput = '{0,7,1,0,4,0,0,0,0,1,0,0,0}'


While I managed to convert the numpy array to a string, I ran into the problem of having the wrong brackets (i.e. the built-in array2string function uses [], while I need {}).

I am open to any suggestions that help solve this efficiently (even for large sparse vectors).

Thank you.

Edit: The given vector is always one-dimensional, i.e. the second number within the first {} will always be 1 (so a single index is enough to locate an element).

Answer

Here is a numpythonic way:

# Assumes `import ast` and `import numpy as np` were run earlier in the session.
In [132]: inp = '{{13,1},{2,1},{4,4},{1,7},{9,1}}'

# Replace the curly brackets with parentheses to turn the string into a valid Python literal.
In [133]: inp = ast.literal_eval(inp.replace('{', '(').replace('}', ')'))
# Unpack the dimensions and the rest of the values from the parsed object
In [134]: dim, *rest = inp
# Create the zero array from the extracted dimensions (here the shape is (13, 1))
In [135]: arr = np.zeros(dim)
# Use `zip` to collect the indices and values separately so they can be passed to `np.put`
In [136]: indices, values = zip(*rest)

In [137]: np.put(arr, indices, values)

In [138]: arr
Out[138]: 
array([[ 0.],
       [ 7.],
       [ 1.],
       [ 0.],
       [ 4.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 1.],
       [ 0.],
       [ 0.],
       [ 0.]])
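
Note that the session above ends with a float array of shape (13, 1), while the question asks for a one-dimensional integer array and a dense string in curly brackets. A minimal sketch of the full round trip could look like the following. It assumes all indices and values are integers and that the first pair always holds the dimensions; the helper names `parse_sparse` and `to_dense_string` are made up for illustration and are not part of numpy.

```python
import re

import numpy as np


def parse_sparse(text):
    """Turn '{{n,1},{i,v},...}' into a 1-D integer array of length n.

    Assumption: every number in the string is an integer.
    """
    # Pull out every integer, then view them as (first, second) pairs.
    numbers = [int(tok) for tok in re.findall(r'-?\d+', text)]
    pairs = np.array(numbers, dtype=int).reshape(-1, 2)
    # The first pair holds the dimensions; only the length is needed (1-D vector).
    dense = np.zeros(pairs[0, 0], dtype=int)
    # The remaining pairs are (index, value); assign them all at once.
    dense[pairs[1:, 0]] = pairs[1:, 1]
    return dense


def to_dense_string(arr):
    """Format a 1-D array as '{a,b,c}' with curly brackets."""
    return '{' + ','.join(map(str, arr.tolist())) + '}'


vec = parse_sparse('{{13,1},{2,1},{4,4},{1,7},{9,1}}')
dense_text = to_dense_string(vec)
print(dense_text)  # {0,7,1,0,4,0,0,0,0,1,0,0,0}
```

Building the output with `str.join` sidesteps the bracket problem of array2string entirely, and the single indexed assignment avoids a Python-level loop over the entries, which should help for large vectors.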