user40314 - 2 months ago
Python Question

Behavior of ndarray.data for views in numpy

I am trying to understand the meaning of the ndarray.data field in numpy (see the memory layout section of the reference page on N-dimensional arrays), especially for views into arrays. To quote the documentation:


ndarray.data -- Python buffer object pointing to the start of the array’s data


According to this description, I was expecting this to be a pointer to the C-array underlying the instance of ndarray.

Consider x = np.arange(5, dtype=np.float64). Form y as a view into x using a slice: y = x[3:1:-1].

I was expecting x.data to point at the location of the element 0. and y.data to point at the location of the element 3. I was thus expecting the memory pointer printed by y.data to be offset by 3*x.itemsize bytes from the memory pointer printed by x.data, but this does not appear to be the case:

>>> import numpy as np
>>> x = np.arange(5, dtype=np.float64)
>>> y = x[ 3:1:-1]
>>> x.data
<memory at 0x000000F2F5150348>
>>> y.data
<memory at 0x000000F2F5150408>
>>> int('0x000000F2F5150408', 16) - int('0x000000F2F5150348', 16)
192
>>> 3*x.itemsize
24


The 'data' key in the __array_interface__ dictionary associated with the ndarray instance behaves more like I expect, although it may itself not be a pointer:

>>> y.__array_interface__['data'][0] - x.__array_interface__['data'][0]
24



So this raises the question: what does ndarray.data actually give?


Thanks in advance.

Answer

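Generally the number displayed by x.data isn't meant to be used by you. In the <memory at 0x...> display it is the address of the memoryview object itself, not the address of the array's data, which is why the difference between the two numbers has nothing to do with itemsize. x.data is the buffer, which can be used in other contexts that expect a buffer.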

np.frombuffer(x.data,dtype=float)

replicates your x.

np.frombuffer(x[3:].data,dtype=float)

This replicates x[3:]. But from Python you can't take x.data, add 192 bits (3*8*8, i.e. 24 bytes) to it, and expect to get x[3:].
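To make that concrete, here is a minimal sketch (the variable names tail, buf, and repr_address are mine, and the id() comparison assumes CPython). It round-trips both buffers through np.frombuffer, then compares the number shown in the buffer's display with the real data addresses.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
tail = x[3:]

# The buffer objects round-trip through np.frombuffer.
assert np.array_equal(np.frombuffer(x.data, dtype=float), x)
assert np.array_equal(np.frombuffer(tail.data, dtype=float), tail)

# Parse the number out of "<memory at 0x...>". On CPython this is the
# address of the memoryview object (what id() returns), not of the data.
buf = x.data
repr_address = int(repr(buf).split(" at ")[1].rstrip(">"), 16)
print(repr_address == id(buf))

# The actual data addresses are 3 items apart.
offset = tail.__array_interface__['data'][0] - x.__array_interface__['data'][0]
print(offset, 3 * x.itemsize)
```

Each access to x.data builds a new memoryview object, so the displayed number can change from one call to the next even though the array's data has not moved.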

I often use the __array_interface__['data'] value to check whether two variables share a data buffer, but I don't use that number for anything else. These are informative numbers, not working values.
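A rough sketch of that kind of check follows; data_address is a hypothetical helper name, not a numpy function. np.shares_memory is a real numpy function and is shown as an alternative for views that start at a different element.

```python
import numpy as np

def data_address(arr):
    """Integer address of arr's first element (hypothetical helper)."""
    return arr.__array_interface__['data'][0]

x = np.arange(5, dtype=np.float64)
view = x[:]       # a view: same buffer, same starting address
copy = x.copy()   # a copy: its own buffer

print(data_address(view) == data_address(x))   # same start address
print(data_address(copy) == data_address(x))   # different buffer

# Comparing start addresses only catches views that begin at the same
# element; np.shares_memory also handles offset or reversed views.
print(np.shares_memory(x, x[3:1:-1]))
```

Comparing addresses this way is only a quick diagnostic; the integers are still just for looking at, not for pointer arithmetic.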

I recently explored this in "Creating a NumPy array directly from __array_interface__".