user2789194 user2789194 - 1 year ago 58
Python Question

No binary operators for structured arrays in Numpy?

Okay, so after going through the tutorials on numpy's structured arrays I am able to create some simple examples:

from numpy import array, ones
names=['scalar', '1d-array', '2d-array']
formats=['float64', '(3,)float64', '(2,2)float64']
my_dtype = dict(names=names, formats=formats)
struct_array1 = ones(1, dtype=my_dtype)
struct_array2 = array([(42., [0., 1., 2.], [[5., 6.],[4., 3.]])], dtype=my_dtype)

(My intended use case would have more than three entries and would use very long 1d-arrays.) So, all goes well until we try to perform some basic math. I get errors for all of the following:

struct_array1 + struct_array2
struct_array1 * struct_array2
1.0 + struct_array1
2.0 * struct_array2

Apparently, simple operators (+, -, *, /) are not supported for even the simplest structured arrays. Or am I missing something? Should I be looking at some other package (and don't say Pandas, because it is total overkill for this)? This seems like an obvious capability, so I'm a little dumbfounded. But it's difficult to find any chatter about this on the net. Doesn't this severely limit the usefulness of structured arrays? Why would anyone use a structure array rather than arrays packed into a dict? Is there a technical reason why this might be intractable? Or, if the correct solution is to perform the arduous work of overloading, then how is that done while keeping the operations fast?

Answer Source

Another way to operate on the whole array is to use the 'union' dtype described in the documentation. In your example, you could expand your dtype by adding a 'union' field, and specifying overlapping 'offsets':

from numpy import array, ones, zeros

names=['scalar', '1d-array', '2d-array', 'union']
formats=['float64', '(3,)float64', '(2,2)float64', '(8,)float64']
offsets=[0, 8, 32, 0]
my_dtype = dict(names=names, formats=formats, offsets=offsets)
struct_array3=zeros((4,), dtype=my_dtype)

['union'] now gives access to all the data as a (n,8) array

struct_array3['union'] # == struct_array3.view('(8,)f8')
struct_array3['union'].shape  # (4,8)

You can operate on 'union' or any other fields:

struct_array3['union'] += 2
struct_array3['scalar']= 1

The 'union' field could another compatible shape, such as '(2,4)float64'. A 'row' of such an array might look like:

array([ (3.0, [0.0, 0.0, 0.0], [[2.0, 2.0], [0.0, 0.0]], 
      [[3.0, 0.0, 0.0, 0.0], [2.0, 2.0, 0.0, 0.0]])], 
             'formats':['<f8',('<f8', (3,)),('<f8', (2, 2)),('<f8', (2, 4))],