askewchan - 3 months ago
Python Question

How to return a view of several columns in numpy structured array

I can see several columns (fields) at once in a numpy structured array by indexing with a list of the field names, for example:

import numpy as np

a = np.array([(1.5, 2.5, (1.0,2.0)), (3.,4.,(4.,5.)), (1.,3.,(2.,6.))],
dtype=[('x',float), ('y',float), ('value',float,(2,2))])

print(a[['x','y']])
#[(1.5, 2.5) (3.0, 4.0) (1.0, 3.0)]

print(a[['x','y']].dtype)
#[('x', '<f8'), ('y', '<f8')]


But the problem is that it seems to be a copy rather than a view:

b = a[['x','y']]
b[0] = (9.,9.)

print(b)
#[(9.0, 9.0) (3.0, 4.0) (1.0, 3.0)]

print(a[['x','y']])
#[(1.5, 2.5) (3.0, 4.0) (1.0, 3.0)]


If I only select one column, it's a view:

c = a['y']
c[0] = 99.

print(c)
#[ 99. 4. 3. ]

print(a['y'])
#[ 99. 4. 3. ]


Is there any way I can get the view behavior for more than one column at once?

I have two workarounds: one is to loop through the columns; the other is to create a hierarchical dtype, so that a single column actually returns a structured array with the two (or more) fields that I want. Unfortunately, zip also returns a copy, so I can't do:

x = a['x']; y = a['y']
z = zip(x,y)
z[0] = (9.,9.)
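
For reference, the hierarchical dtype workaround mentioned above might look like the following sketch. The nested field name xy and the array name nested are hypothetical, chosen only for illustration; the idea is that single-field indexing returns a view, so grouping x and y under one parent field gives a writable view of both.

```python
import numpy as np

# Group 'x' and 'y' under one parent field (hypothetical name 'xy'),
# next to the same 'value' sub-array field used in the question.
nested = np.zeros(3, dtype=[('xy', [('x', float), ('y', float)]),
                            ('value', float, (2, 2))])

xy = nested['xy']     # single-field index: a view, itself a structured array
xy[0] = (9.0, 9.0)    # write through the view

print(nested['xy'])   # the first record reflects the write
```

The cost is that x and y must then be reached as nested['xy']['x'] and nested['xy']['y'], and the grouping has to be decided when the array is created.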

Answer

You can create a dtype object containing only the fields that you want, and use numpy.ndarray() to create a view of the original array:

import numpy as np
strc = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int), ('t', "i8")])

def fields_view(arr, fields):
    dtype2 = np.dtype({name:arr.dtype.fields[name] for name in fields})
    return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)

v1 = fields_view(strc, ["x", "z"])
v1[0] = 10, 100

v2 = fields_view(strc, ["y", "z"])
v2[1:] = [(3.14, 7)]

v3 = fields_view(strc, ["x", "t"])

v3[1:] = [(1000, 2**16)]

print(strc)

Here is the output:

[(10, 0.0, 100, 0L) (1000, 3.14, 7, 65536L) (1000, 3.14, 7, 65536L)]
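
A variant of fields_view that also runs on Python 3 could look like the sketch below. It spells out the dtype with the names/formats/offsets/itemsize dictionary form instead of the shorter {name: (dtype, offset)} form used above. That is a substitution made for illustration, not the exact code of this answer, and demo and v are hypothetical names.

```python
import numpy as np

def fields_view(arr, fields):
    # Describe only the requested fields, but keep each field's original
    # byte offset and the full record size, so the new dtype overlays
    # the existing records exactly.
    dtype2 = np.dtype({
        'names': list(fields),
        'formats': [arr.dtype.fields[name][0] for name in fields],
        'offsets': [arr.dtype.fields[name][1] for name in fields],
        'itemsize': arr.dtype.itemsize,
    })
    # Reuse arr's memory as the buffer, so no data is copied.
    return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)

demo = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int), ('t', 'i8')])
v = fields_view(demo, ['x', 'z'])
v[0] = (10, 100)   # write through the two-field view
print(demo[0])     # the original record now holds x=10 and z=100
```

Because the view keeps the parent's itemsize and strides, each of its records starts at the same bytes as the corresponding record in the original array. The fields left out are simply not exposed.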