snowleopard - 2 months ago
Python Question

Splitting numpy array field values that are matrices into column vectors

I have the following numpy structured array:

x = np.array([(22, 2, -1000000000.0, [1000,2000.0]), (22, 2, 400.0, [1000,2000.0])],
dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<f4'), ('f4', '<f4',2)])


As you can see, field 'f4' is a matrix:

In [63]: x['f4']
Out[63]:
array([[ 1000., 2000.],
[ 1000., 2000.]], dtype=float32)


My end goal is a numpy structured array in which every field is a vector. How can I split 'f4' into two fields ('f41' and 'f42'), each holding one column of the matrix?

In [67]: x
Out[67]:
array([(22, 2, -1000000000.0, 1000.0, 2000.0),
(22, 2, 400.0, 1000.0, 2000.0)],
dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<f4'), ('f41', '<f4'), ('f42', '<f4')])


I would also like to know whether this can be done with operations that modify the array in place, or at least with minimal copying of the original data.

Answer

You can do this by creating a new view of the array with ndarray.view, which reinterprets the same memory under a different dtype and does not copy the data:

import numpy as np

x = np.array([(22, 2, -1000000000.0, [1000,2000.0]),
              (22, 2, 400.0, [1000,2000.0])],
             dtype=[('f1', '<i4'),
                    ('f2', '<i4'),
                    ('f3', '<f4'),
                    ('f4', '<f4', 2)])
xNewView = x.view(dtype=[('f1', '<i4'),
                         ('f2', '<i4'),
                         ('f3', '<f4'),
                         ('f41', '<f4'),
                         ('f42', '<f4')])
print(np.may_share_memory(x, xNewView)) # True
print(xNewView)
# array([(22, 2, -1000000000.0, 1000.0, 2000.0),
#        (22, 2, 400.0, 1000.0, 2000.0)], 
#       dtype=[('f1', '<i4'),  ('f2', '<i4'), ('f3', '<f4'),
#              ('f41', '<f4'), ('f42', '<f4')])

print(xNewView['f41'])           # array([ 1000.,  1000.], dtype=float32)
print(xNewView['f42'])           # array([ 2000.,  2000.], dtype=float32)

xNewView can then be used instead of x.
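
Because the view shares memory with x, writes made through it also show up in x. The sketch below illustrates that, and builds the flattened dtype programmatically instead of typing it out by hand. The helper name flatten_subarray_fields and the name1, name2, ... numbering scheme are assumptions made for illustration, not part of numpy. It also assumes a packed (unpadded) dtype, so that the flattened layout occupies exactly the same bytes as the original.

```python
import numpy as np

x = np.array([(22, 2, -1000000000.0, [1000, 2000.0]),
              (22, 2, 400.0, [1000, 2000.0])],
             dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<f4'), ('f4', '<f4', 2)])


def flatten_subarray_fields(dtype):
    """Hypothetical helper: expand each subarray field into scalar fields.

    A field 'f4' holding 2 elements becomes 'f41' and 'f42'.
    Scalar fields are kept unchanged.
    """
    fields = []
    for name in dtype.names:
        field_dtype = dtype[name]
        if field_dtype.subdtype is None:
            # Plain scalar field: keep as is.
            fields.append((name, field_dtype))
        else:
            # Subarray field: one scalar field per element, numbered from 1.
            base, shape = field_dtype.subdtype
            count = int(np.prod(shape))
            fields.extend((f"{name}{i + 1}", base) for i in range(count))
    return np.dtype(fields)


flat_dtype = flatten_subarray_fields(x.dtype)

# A view only reinterprets bytes, so both layouts must have the same size.
assert flat_dtype.itemsize == x.dtype.itemsize

flat = x.view(flat_dtype)

# Writing through the view modifies the original array as well.
flat['f41'][0] = 5.0
print(x['f4'][0])
```

If the itemsize check fails (for example with an aligned dtype that contains padding), the view would not line up with the original fields, and copying into a newly allocated array is the safer route.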