Alexander McFarlane - 1 year ago
Python Question

Apply a function to the 0-dimension of an ndarray


  • I have an np.ndarray, arr, that is an n-dimensional cube with length m
    in each dimension.

  • I want to apply a function, f, by slicing along dimension 0 and taking
    each (n-1)-dim slice as an input to the function.

This seems to work with the builtin map but I can't find a numpy variant
that is appropriate. np.vectorize seems to split the n-tensor into
individual scalar entries. Neither np.apply_along_axis nor
np.apply_over_axes seem appropriate either.

My problem is such that I need to pass arbitrary functions as inputs, so I
do not see a specialised solution being feasible either.


  • Do you know the best numpy alternative to using
    np.asarray(map(func, arr))?


I define an example array, arr, as a 4-dim cube (or 4-tensor) by:

m, n = 3, 4
arr = np.arange(m**n).reshape((m,)*n)

I define an example function, f:

def f(x):
    """makes it obvious how the np.ndarray is being passed into the function"""
    try:  # perform an op using x[0,0,0], which is expected to exist
        i = x[0, 0, 0]
    except IndexError:
        print('\nno element x[0,0,0] in x: \n{}'.format(x))
        return np.nan
    return x - x + i

The expected result, res, from this function would remain the same shape but would satisfy the following:

print(all([(res[i] == i*m**(n-1)).all() for i in range(m)]))

This works with the builtin map:

res = np.asarray(map(f, arr))  # on Python 3, use np.asarray(list(map(f, arr)))
print(all([(res[i] == i*m**(n-1)).all() for i in range(m)]))

I would expect np.vectorize to work in the same way as map, but it acts on scalar entries:

res = np.vectorize(f)(arr)

no element x[0,0,0] in x:

Answer Source

Given that arr is 4d, and your fn works on 3d arrays,

np.asarray(map(func, arr))

looks perfectly reasonable. I'd use the list comprehension form, but that's a matter of programming style:

np.asarray([func(i) for i in arr])

for i in arr iterates on the first dimension of arr. In effect it treats arr as a list of the 3d arrays. And then it reassembles the resulting list into a 4d array.
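As a runnable sketch of that iteration, using the question's arr and a simplified stand-in for f (the real f may do anything with each 3d slice):

```python
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,)*n)

def f(x):
    # simplified stand-in for the question's f: replace every entry of
    # the 3d slice with the slice's first element
    return x - x + x[0, 0, 0]

res = np.asarray([f(sub) for sub in arr])  # iterate over axis 0
print(res.shape == arr.shape)  # True: the 4d shape is reassembled
```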

The np.vectorize doc could be more explicit about the function taking scalars. But yes, it passes values as scalars. Note that np.vectorize does not have provision for passing an iteration axis parameter. It's most useful when your function takes values from several arrays, something like

 [func(a,b) for a,b in zip(arrA, arrB)]

It generalizes the zip to allow for broadcasting. But otherwise it is an iterative solution. It knows nothing about the guts of your func, so it can't speed up its calls.
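A small sketch of that use case (the function g and the arrays are made up for illustration):

```python
import numpy as np

def g(a, b):
    # scalar-only logic that plain array arithmetic can't express directly
    return a + b if a > b else a - b

vg = np.vectorize(g)
A = np.array([[1, 2], [3, 4]])
b = np.array([2, 2])
print(vg(A, b))  # b is broadcast across A's rows: [[-1  0] [ 5  6]]
```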

np.vectorize ends up calling np.frompyfunc, which being a bit less general is a bit faster. But it too passes scalars to the func.
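For example, a minimal np.frompyfunc sketch; note that it always returns an object-dtype array:

```python
import numpy as np

# wrap a 1-input, 1-output Python function as a ufunc-like object
inc = np.frompyfunc(lambda x: x + 1, 1, 1)
out = inc(np.arange(3))
print(out.dtype)        # object
print(out.astype(int))  # cast back if you need a numeric dtype
```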

np.apply_along_axis and np.apply_over_axes also iterate over one or more axes. You may find their code instructive, but I agree they don't apply here.

A variation on the map approach is to allocate the result array, and index:

In [45]: res=np.zeros_like(arr,int)
In [46]: for i in range(arr.shape[0]):
    ...:     res[i,...] = f(arr[i,...])

This may be easier if you need to iterate on a different axis than the 1st.
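For instance, a sketch of the same allocate-and-index pattern iterating over axis 1 (the doubling stands in for an arbitrary per-slice function):

```python
import numpy as np

arr = np.arange(81).reshape(3, 3, 3, 3)
res = np.zeros_like(arr)
for j in range(arr.shape[1]):            # iterate over axis 1 instead of axis 0
    res[:, j, ...] = arr[:, j, ...] * 2  # any per-slice function goes here
print((res == arr * 2).all())            # True
```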

You need to do your own timings to see which is faster.


An example of iteration over the 1st dimension with in-place modification:

In [58]: arr.__array_interface__['data']  # data buffer address
Out[58]: (152720784, False)

In [59]: for i,a in enumerate(arr):
    ...:     print(a.__array_interface__['data'])
    ...:     a[0,0,:]=i
(152720784, False)   # address of the views (same buffer)
(152720892, False)
(152721000, False)

In [60]: arr
Out[60]: 
array([[[[ 0,  0,  0],
         [ 3,  4,  5],
         [ 6,  7,  8]],
         ...

       [[[ 1,  1,  1],
         [30, 31, 32],
         ...

       [[[ 2,  2,  2],
         [57, 58, 59],
         [60, 61, 62]],
         ...     # output truncated
When I iterate over an array, I get a view that starts at successive points on the common data buffer. If I modify the view, as above or even with a[:] = ..., I modify the original; I don't have to write anything back. But don't use a = ..., which rebinds the name and breaks the link to the original array.
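The view-versus-rebinding distinction in a small sketch:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)
for i, a in enumerate(arr):
    a[:] = i        # writes through the view into arr's buffer
print(arr[1])       # [1 1 1 1]

for i, a in enumerate(arr):
    a = a + 10      # rebinds the local name only; arr is untouched
print(arr[1])       # still [1 1 1 1]
```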
