snowleopard snowleopard - 21 days ago 6
Python Question

View of numpy structured array with offsets

I have the following numpy structured array:

In [250]: x
Out[250]:
array([(22, 2, -1000000000, 2000), (22, 2, 400, 2000),
(22, 2, 804846, 2000), (44, 2, 800, 4000), (55, 5, 900, 5000),
(55, 5, 1000, 5000), (55, 5, 8900, 5000), (55, 5, 11400, 5000),
(33, 3, 14500, 3000), (33, 3, 40550, 3000), (33, 3, 40990, 3000),
(33, 3, 44400, 3000)],
dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<i4'), ('f4', '<i4')])


The array below is a subset (also a view) of the above array:

In [251]: fields=['f1','f3']

In [252]: y=x.getfield(np.dtype(
...: {name: x.dtype.fields[name] for name in fields}
...: ))

In [253]: y
Out[253]:
array([(22, -1000000000), (22, 400), (22, 804846), (44, 800), (55, 900),
(55, 1000), (55, 8900), (55, 11400), (33, 14500), (33, 40550),
(33, 40990), (33, 44400)],
dtype={'names':['f1','f3'], 'formats':['<i4','<i4'], 'offsets':[0,8], 'itemsize':12})


I am trying to convert y to a regular (non-structured) numpy array, and I want the result to be a view rather than a copy. However, the following raises an error:

In [254]: y.view(('<i4',2))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-254-88440f106a89> in <module>()
----> 1 y.view(('<i4',2))

C:\numpy\core\_internal.pyc in _view_is_safe(oldtype, newtype)
499
500 # raises if there is a problem
--> 501 _check_field_overlap(new_fieldtile, old_fieldtile)
502
503 # Given a string containing a PEP 3118 format specifier,

C:\numpy\core\_internal.pyc in _check_field_overlap(new_fields, old_fields)
402 old_bytes.update(set(range(off, off+tp.itemsize)))
403 if new_bytes.difference(old_bytes):
--> 404 raise TypeError("view would access data parent array doesn't own")
405
406 #next check that we do not interpret non-Objects as Objects, and vv

TypeError: view would access data parent array doesn't own


However, if I choose consecutive fields, it works:

In [255]: fields=['f1','f2']
...:
...: y=x.getfield(np.dtype(
...: {name: x.dtype.fields[name] for name in fields}
...: ))
...:

In [256]: y
Out[256]:
array([(22, 2), (22, 2), (22, 2), (44, 2), (55, 5), (55, 5), (55, 5),
(55, 5), (33, 3), (33, 3), (33, 3), (33, 3)],
dtype=[('f1', '<i4'), ('f2', '<i4')])

In [257]: y.view(('<i4',2))
Out[257]:
array([[22, 2],
[22, 2],
[22, 2],
[44, 2],
[55, 5],
[55, 5],
[55, 5],
[55, 5],
[33, 3],
[33, 3],
[33, 3],
[33, 3]])


View casting does not seem to work when the fields are not contiguous. Is there an alternative?

Answer

Yes. Use the ndarray constructor directly and supply the byte offset and strides yourself:

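import numpy as np

x = np.array([(22, 2, -1000000000, 2000),
              (22, 2,         400, 2000),
              (22, 2,      804846, 2000),
              (44, 2,         800, 4000),
              (55, 5,         900, 5000),
              (55, 5,        1000, 5000)],
             dtype=[('f1','i'),('f2','i'),('f3','i'),('f4','i')])

fields = ['f4', 'f1']
# One extra trailing axis, with one entry per selected field.
shape = x.shape + (len(fields),)
# Byte offset of each selected field within a record.
offsets = [x.dtype.fields[name][1] for name in fields]
# The fields must be equally spaced (all second differences zero) so that
# a single stride can step from one field to the next.
assert not any(np.diff(offsets, 2))
# Reuse the record stride, then add the step between fields (may be negative).
strides = x.strides + (offsets[1] - offsets[0],)
# Reinterpret x's buffer directly; no data is copied.
y = np.ndarray(shape=shape, dtype='i', buffer=x,
               offset=offsets[0], strides=strides)
print(repr(y))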

Gives:

array([[2000,   22],
       [2000,   22],
       [2000,   22],
       [4000,   44],
       [5000,   55],
       [5000,   55]])
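
The same steps can be wrapped in a small function and applied to the f1/f3 case from the question. This is only a sketch: the name fields_view is made up for illustration (it is not a NumPy function), and it assumes the input is a C-contiguous structured array whose selected fields all share one dtype and sit at equally spaced offsets.

```python
import numpy as np

def fields_view(arr, fields):
    """Hypothetical helper: expose equally spaced, same-dtype fields of a
    C-contiguous structured array as a plain array, without copying."""
    dtypes = [arr.dtype.fields[name][0] for name in fields]
    offsets = [arr.dtype.fields[name][1] for name in fields]
    if any(dt != dtypes[0] for dt in dtypes):
        raise TypeError("all selected fields must share one dtype")
    steps = np.diff(offsets)
    if len(steps) and np.any(steps != steps[0]):
        raise ValueError("field offsets must be equally spaced")
    # With a single field the last axis has length 1, so its stride is unused.
    step = int(steps[0]) if len(steps) else dtypes[0].itemsize
    return np.ndarray(shape=arr.shape + (len(fields),), dtype=dtypes[0],
                      buffer=arr, offset=offsets[0],
                      strides=arr.strides + (step,))

x = np.array([(22, 2, -1000000000, 2000),
              (22, 2, 400, 2000),
              (44, 2, 800, 4000)],
             dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<i4'), ('f4', '<i4')])

y = fields_view(x, ['f1', 'f3'])   # the non-contiguous pair from the question
y[0, 1] = 7                        # write through the view...
print(x['f3'][0])                  # ...and x sees it: prints 7
```

Writing through y and reading the change back from x is a simple way to confirm that y shares memory with x instead of holding a copy. Fields at unevenly spaced offsets (for example f1, f2 and f4 here) cannot be described by a single stride, so the sketch raises an error for them.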