Alexander McFarlane - 1 year ago 165
Python Question

# Apply a function to the 0-dimension of an ndarray

## Problem

• I have an `ndarray`, defined by `arr`, that is an `n`-dimensional cube with length `m` in each dimension.

• I want to apply a function, `func`, by slicing along dimension 0 and passing each `(n-1)`-dimensional slice as an input to the function.

This seems to work with `map()`, but I can't find an appropriate `numpy` variant. `np.vectorize` seems to split each `(n-1)`-tensor into individual scalar entries. Neither `apply_along_axis` nor `apply_over_axes` seems appropriate either.

My problem is such that I need to pass arbitrary functions as inputs, so I do not see a solution with `einsum` being feasible either.

## Question

• Do you know the best `numpy` alternative to using `np.asarray(map(func, arr))`?

## Example

I define an example array, `arr`, as a `4`-dim cube (or 4-tensor) by:

``````
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,)*n)
``````

I define an example function, `f`:

``````
def f(x):
    """makes it obvious how the np.ndarray is being passed into the function"""
    try:  # perform an op using x[0,0,0] which is expected to exist
        i = x[0,0,0]
    except (IndexError, TypeError):  # x has fewer than 3 dimensions, e.g. a scalar
        print('\nno element x[0,0,0] in x: \n{}'.format(x))
        return np.nan
    return x-x+i
``````

The expected result, `res`, from this function would keep the same shape but would satisfy the following:

``````
print(all((res[i] == i*m**(n-1)).all() for i in range(m)))
``````

This works with the default `map()` function:

``````
# list() is needed on Python 3, where map() returns a lazy iterator
res = np.asarray(list(map(f, arr)))
print(all((res[i] == i*m**(n-1)).all() for i in range(m)))
# True
``````

I would expect `np.vectorize` to work in the same way as `map()`, but it acts on scalar entries:

``````
res = np.vectorize(f)(arr)

no element x[0,0,0] in x:
0
...
``````

Given that `arr` is 4d, and your `func` works on 3d arrays,

``````
np.asarray(map(func, arr))
``````

looks perfectly reasonable. I'd use the list comprehension form, but that's a matter of programming style:

``````
np.asarray([func(i) for i in arr])
``````

`for i in arr` iterates over the first dimension of `arr`. In effect it treats `arr` as a list of 3d arrays. `np.asarray` then reassembles the resulting list into a 4d array.
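As a self-contained sketch of the two forms above, written for Python 3 (where `map()` returns an iterator, so it has to be wrapped in `list()` before `np.asarray` sees it). The `f` here is a simplified stand-in for the function in the question, not the original:

```python
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,) * n)   # shape (3, 3, 3, 3)

def f(x):
    # Simplified stand-in: fill the 3d slice with its own x[0, 0, 0].
    return x - x + x[0, 0, 0]

# Iterating over arr yields the 3d sub-arrays arr[0], arr[1], ...
res_list = np.asarray([f(x) for x in arr])

# Equivalent map() form; list() is required on Python 3.
res_map = np.asarray(list(map(f, arr)))

print(res_list.shape)                     # (3, 3, 3, 3)
print(np.array_equal(res_list, res_map))  # True
```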

The `np.vectorize` docs could be more explicit about the function taking scalars. But yes, it passes values as scalars. Note that `np.vectorize` has no provision for passing an iteration axis parameter. It's most useful when your function takes values from several arrays, something like

``````
[func(a,b) for a,b in zip(arrA, arrB)]
``````

It generalizes `zip` to allow for broadcasting. But otherwise it is an iterative solution. It knows nothing about the guts of your `func`, so it can't speed up its calls.
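To make the "generalized `zip` with broadcasting" point concrete, here is a small sketch. The function `add_scaled` and the arrays `arr_a` and `arr_b` are made up for illustration:

```python
import numpy as np

def add_scaled(a, b):
    # Called once per element pair; a and b are scalars here.
    return a + 10 * b

arr_a = np.arange(3)                 # shape (3,)
arr_b = np.arange(2).reshape(2, 1)   # shape (2, 1)

vfunc = np.vectorize(add_scaled)
out = vfunc(arr_a, arr_b)            # inputs broadcast to shape (2, 3)

# The same result spelled out with explicit broadcasting plus zip.
bc_a, bc_b = np.broadcast_arrays(arr_a, arr_b)
manual = np.array(
    [add_scaled(a, b) for a, b in zip(bc_a.ravel(), bc_b.ravel())]
).reshape(bc_a.shape)

print(out.shape)                     # (2, 3)
print(np.array_equal(out, manual))   # True
```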

`np.vectorize` ends up calling `np.frompyfunc`, which being a bit less general is a bit faster. But it too passes scalars to the func.
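Both of these pass scalars by default. Recent NumPy releases (1.12 and later, as far as I know) also let `np.vectorize` take a `signature` argument that declares core dimensions, in which case the function receives whole sub-arrays instead. It is still a Python-level loop, so I would not expect it to be faster. A sketch, again with a simplified stand-in for the question's `f`:

```python
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,) * n)

seen_shapes = []

def f(x):
    # Record what is actually passed in, then fill with x[0, 0, 0].
    seen_shapes.append(np.shape(x))
    return x - x + x[0, 0, 0]

# Three core dimensions in, three out; the leading axis is looped over.
vf = np.vectorize(f, signature='(i,j,k)->(i,j,k)')
res = vf(arr)

print(res.shape)
print(set(seen_shapes))
```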

`np.apply_along_axis` and `np.apply_over_axes` also iterate over one or more axes. You may find their code instructive, but I agree they don't apply here.
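A quick way to see why `np.apply_along_axis` doesn't fit: it hands the function 1d slices, not the 3d blocks wanted here. The `record` function below is just an illustrative probe:

```python
import numpy as np

arr = np.arange(3**4).reshape((3,) * 4)
seen_shapes = set()

def record(x):
    # Note the shape of each slice, then return it unchanged.
    seen_shapes.add(x.shape)
    return x

out = np.apply_along_axis(record, 0, arr)

print(seen_shapes)   # {(3,)}
print(out.shape)     # (3, 3, 3, 3)
```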

A variation on the map approach is to allocate the result array, and index:

``````
In [45]: res=np.zeros_like(arr,int)
In [46]: for i in range(arr.shape[0]):
    ...:     res[i,...] = f(arr[i,...])
``````

This may be easier if you need to iterate on a different axis than the 1st.
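For instance, the allocate-and-index loop could be wrapped so the axis is a parameter. `apply_on_axis` is a hypothetical helper name, and the sketch assumes `func` returns an array of the same shape as the slice it is given:

```python
import numpy as np

def apply_on_axis(func, arr, axis=0):
    """Hypothetical helper: call func on each slice taken along `axis`.

    Assumes func returns an array shaped like the slice it receives.
    """
    res = np.zeros_like(arr)
    for i in range(arr.shape[axis]):
        idx = [slice(None)] * arr.ndim   # start from arr[:, :, ..., :]
        idx[axis] = i                    # fix position i on the chosen axis
        idx = tuple(idx)
        res[idx] = func(arr[idx])
    return res

arr = np.arange(3**4).reshape((3,) * 4)

def f(x):
    # Simplified stand-in for the question's f.
    return x - x + x[0, 0, 0]

res0 = apply_on_axis(f, arr, axis=0)
res1 = apply_on_axis(f, arr, axis=1)

print(np.array_equal(res0, np.asarray([f(x) for x in arr])))  # True
```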

You need to do your own timings to see which is faster.
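One way to set up such a comparison is with the standard-library `timeit` module. This is only a sketch: the stand-in `f` is trivial, and the numbers will depend on your own `func` and array sizes.

```python
import timeit
import numpy as np

arr = np.arange(3**4).reshape((3,) * 4)

def f(x):
    # Simplified stand-in for the question's f.
    return x - x + x[0, 0, 0]

def with_comprehension():
    return np.asarray([f(x) for x in arr])

def with_preallocation():
    res = np.zeros_like(arr)
    for i in range(arr.shape[0]):
        res[i, ...] = f(arr[i, ...])
    return res

for fn in (with_comprehension, with_preallocation):
    elapsed = timeit.timeit(fn, number=200)
    print('{}: {:.4f} s for 200 calls'.format(fn.__name__, elapsed))
```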

========================

An example of iteration over the 1st dimension with in-place modification:

``````
In [58]: arr.__array_interface__['data']  # data buffer address
Out[58]: (152720784, False)

In [59]: for i,a in enumerate(arr):
    ...:     print(a.__array_interface__['data'])
    ...:     a[0,0,:]=i
    ...:
(152720784, False)   # address of the views (same buffer)
(152720892, False)
(152721000, False)

In [60]: arr
Out[60]:
array([[[[ 0,  0,  0],
         [ 3,  4,  5],
         [ 6,  7,  8]],

        ...

       [[[ 1,  1,  1],
         [30, 31, 32],
         ...

       [[[ 2,  2,  2],
         [57, 58, 59],
         [60, 61, 62]],
        ...]]])
``````

When I iterate over an array, I get a view that starts at successive points on the common data buffer. If I modify the view, as above or even with `a[:]=...`, I modify the original. I don't have to write anything back. But don't use `a = ....`, which only rebinds the name `a` to a new object and so breaks the link to the original array.
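A minimal sketch of that difference between writing through the view and rebinding the loop variable (the values written are arbitrary):

```python
import numpy as np

arr = np.arange(3**4).reshape((3,) * 4)

for i, a in enumerate(arr):
    a[:] = i          # writes through the view into arr's buffer

in_place = arr.copy() # snapshot after the in-place writes

for i, a in enumerate(arr):
    a = a + 100       # rebinds the local name a; arr is untouched

print((arr[1] == 1).all())             # True
print(np.array_equal(arr, in_place))   # True
print(np.shares_memory(arr[0], arr))   # True
```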
