Alexander McFarlane - 9 months ago 29

Python Question

- I have an `ndarray`, defined by `arr`, that is an `n`-dimensional cube with length `m` in each dimension.

- I want to apply a function, `func`, by slicing along the dimension `n=0` and taking each `n-1`-dim slice as an input to the function.

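This seems to work for `map()`, but I can't find a `numpy` variant that is appropriate: `np.vectorize` seems to split the `n-1`-dim slice into individual scalar entries, and neither `apply_along_axis` nor `apply_over_axes` seems appropriate either.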

My problem is such that I need to pass arbitrary functions as inputs, so I do not see a solution with `einsum` either.

- Do you know the best `numpy` alternative to using `np.asarray(map(func, arr))`?

I define an example array, `arr`, as a `4`-dimensional cube:

```
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,)*n)
```

I define an example function `f`:

```
def f(x):
    """makes it obvious how the np.ndarray is being passed into the function"""
    try:  # perform an op using x[0,0,0] which is expected to exist
        i = x[0,0,0]
    except (IndexError, TypeError):  # x is a scalar, not a 3d array
        print('\nno element x[0,0,0] in x: \n{}'.format(x))
        return np.nan
    return x-x+i
```

The expected result, `res`, has the same shape as `arr` and satisfies:

```
print(all([(res[i] == i*m**(n-1)).all() for i in range(m)]))
```

This works with the default `map()`:

```
res = np.asarray(map(f, arr))  # Python 2; in Python 3 use list(map(f, arr))
print(all([(res[i] == i*m**(n-1)).all() for i in range(m)]))
True
```

I would expect `np.vectorize` to work in the same way as `map()`, but it acts on scalar entries:

```
res = np.vectorize(f)(arr)

no element x[0,0,0] in x:
0
...
```

Answer

Given that `arr` is 4d, and your `func` works on 3d arrays,

```
np.asarray(map(func, arr))
```

looks perfectly reasonable. I'd use the list comprehension form, but that's a matter of programming style:

```
np.asarray([func(i) for i in arr])
```

`for i in arr` iterates on the first dimension of `arr`. In effect it treats `arr` as a list of the 3d arrays. And then it reassembles the resulting list into a 4d array.
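
To make that concrete, here is a minimal Python 3 sketch. `func` is a hypothetical stand-in for the question's function (it fills each 3d block with its first element), so treat it as an illustration rather than the asker's actual code.

```python
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,) * n)   # shape (3, 3, 3, 3)

def func(x):
    """Hypothetical stand-in: fill a 3d block with its first element."""
    return np.full_like(x, x[0, 0, 0])

# Iterating over arr yields its m sub-arrays along axis 0, each of them 3d.
blocks = [func(block) for block in arr]

# np.asarray joins the list of 3d results along a new leading axis.
res = np.asarray(blocks)

print([b.ndim for b in blocks])   # [3, 3, 3]
print(res.shape)                  # (3, 3, 3, 3)
```

In Python 3 `map()` returns an iterator, so the `map` form needs `np.asarray(list(map(func, arr)))` to give the same result.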

The `np.vectorize` doc could be more explicit about the function taking scalars. But yes, it passes values as scalars. Note that `np.vectorize` does not have provision for passing an iteration axis parameter. It's most useful when your function takes values from several arrays, something like

```
[func(a,b) for a,b in zip(arrA, arrB)]
```

It generalizes the `zip` to allow for broadcasting. But otherwise it is an iterative solution. It knows nothing about the guts of your `func`, so it can't speed up its calls.
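
A small sketch of both points, with hypothetical helper names (`add`, `first_fill`). The first part records what the wrapped function receives and shows the broadcasting. The second part assumes NumPy 1.12 or later, where `np.vectorize` accepts a `signature` argument that hands whole sub-arrays to the function; it is still a Python-level loop, so don't expect a speedup from it.

```python
import numpy as np

seen = []  # records the dimensionality of each argument actually passed

def add(a, b):
    seen.append((np.ndim(a), np.ndim(b)))
    return a + b

col = np.arange(3).reshape(3, 1)    # shape (3, 1)
row = np.arange(4).reshape(1, 4)    # shape (1, 4)

out = np.vectorize(add)(col, row)   # inputs are broadcast against each other
print(out.shape)                    # (3, 4)
print(set(seen))                    # {(0, 0)}: every call received two scalars

# Assumption: NumPy >= 1.12. The signature declares that the function
# consumes and returns 3d blocks, so only the leading axis is looped over.
def first_fill(x):
    return np.full_like(x, x[0, 0, 0])

arr = np.arange(3**4).reshape((3,) * 4)
blockwise = np.vectorize(first_fill, signature='(i,j,k)->(i,j,k)')
res = blockwise(arr)
print(res.shape)                    # (3, 3, 3, 3)
```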

`np.vectorize` ends up calling `np.frompyfunc`, which being a bit less general is a bit faster. But it too passes scalars to the func.
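
A quick way to check this yourself, as a sketch: wrap a function that reports the dimensionality of its argument. The name `report_ndim` is just for illustration.

```python
import numpy as np

def report_ndim(x):
    # Returns 0 for a scalar and 3 for a 3d array.
    return np.ndim(x)

ufunc = np.frompyfunc(report_ndim, 1, 1)   # 1 input, 1 output
arr = np.arange(3**4).reshape((3,) * 4)
out = ufunc(arr)

print(out.shape)                  # (3, 3, 3, 3), same as the input
print(out.dtype)                  # object
print(all(v == 0 for v in out.flat))   # True: each call saw a scalar
```

Note that `np.frompyfunc` always returns an object-dtype array, so you may want an `astype(...)` afterwards.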

`np.apply_along_axis` and `np.apply_over_axes` also iterate over one or more axes. You may find their code instructive, but I agree they don't apply here.
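
For instance, `np.apply_along_axis` hands the function 1d slices, not the 3d blocks wanted here. A minimal check:

```python
import numpy as np

arr = np.arange(3**4).reshape((3,) * 4)

# The lambda reports the dimensionality of each slice it is given.
ndims = np.apply_along_axis(lambda v: v.ndim, 0, arr)

print(ndims.shape)          # (3, 3, 3): one call per 1d slice along axis 0
print((ndims == 1).all())   # True: every slice was 1d
```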

A variation on the map approach is to allocate the result array, and index:

```
In [45]: res = np.zeros_like(arr, int)
In [46]: for i in range(arr.shape[0]):
    ...:     res[i, ...] = f(arr[i, ...])
```

This may be easier if you need to iterate on a different axis than the 1st.
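
As a sketch of that idea, the loop can be wrapped in a helper that takes the axis as a parameter. `map_over_axis` and `first_fill` are hypothetical names, and the helper assumes the function returns something with the same shape as the slice it receives.

```python
import numpy as np

def map_over_axis(func, arr, axis=0):
    """Hypothetical helper: apply func to each slice taken along `axis`."""
    res = np.zeros_like(arr)
    for i in range(arr.shape[axis]):
        # Build an index like (:, i, :, :) that selects slice i along `axis`.
        idx = [slice(None)] * arr.ndim
        idx[axis] = i
        idx = tuple(idx)
        res[idx] = func(arr[idx])
    return res

def first_fill(x):
    # Stand-in function: fill the block with its first element.
    return np.full_like(x, x[0, 0, 0])

arr = np.arange(3**4).reshape((3,) * 4)
res = map_over_axis(first_fill, arr, axis=1)
print(res.shape)              # (3, 3, 3, 3)
print(res[0, :, 0, 0])        # first element of each slice along axis 1
```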

You need to do your own timings to see which is faster.
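
One way to run such timings is with the standard `timeit` module. This is only a sketch: `func` is a stand-in, the array is tiny, and the numbers will depend on your machine and on how expensive your real function is.

```python
import timeit
import numpy as np

m, n = 3, 4
arr = np.arange(m**n).reshape((m,) * n)

def func(x):
    # Stand-in for the real function.
    return np.full_like(x, x[0, 0, 0])

def with_map():
    return np.asarray(list(map(func, arr)))

def with_comprehension():
    return np.asarray([func(block) for block in arr])

def with_preallocated():
    res = np.zeros_like(arr)
    for i in range(arr.shape[0]):
        res[i, ...] = func(arr[i, ...])
    return res

for candidate in (with_map, with_comprehension, with_preallocated):
    seconds = timeit.timeit(candidate, number=200)
    print('{:<20} {:.4f} s for 200 calls'.format(candidate.__name__, seconds))
```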

========================

An example of iteration over the 1st dimension with in-place modification:

```
In [58]: arr.__array_interface__['data']   # data buffer address
Out[58]: (152720784, False)
In [59]: for i,a in enumerate(arr):
    ...:     print(a.__array_interface__['data'])
    ...:     a[0,0,:]=i
    ...:
(152720784, False)   # address of the views (same buffer)
(152720892, False)
(152721000, False)
In [60]: arr
Out[60]:
array([[[[ 0,  0,  0],
         [ 3,  4,  5],
         [ 6,  7,  8]],
        ...
       [[[ 1,  1,  1],
         [30, 31, 32],
        ...
       [[[ 2,  2,  2],
         [57, 58, 59],
         [60, 61, 62]],
        ...]]])
```

When I iterate over an array, I get a view that starts at successive points on the common data buffer. If I modify the view, as above or even with `a[:]=...`, I modify the original. I don't have to write anything back. But don't use `a = ....`, which breaks the link to the original array.
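
The difference between the two kinds of assignment can be seen in a short sketch (a small 3d array is used here just to keep it readable):

```python
import numpy as np

arr = np.arange(27).reshape(3, 3, 3)

# In-place assignment: each `a` is a view, so the write lands in arr.
for i, a in enumerate(arr):
    a[0, :] = -i
in_place_values = arr[:, 0, 0].tolist()

# Rebinding: `a = ...` points the name at a new array; arr is untouched.
snapshot = arr.copy()
for a in arr:
    a = a * 0
rebinding_changed_arr = not np.array_equal(arr, snapshot)

print(in_place_values)         # [0, -1, -2]
print(rebinding_changed_arr)   # False
```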