astrofrog - 8 months ago 56

Python Question

In a large code base, I am using

`np.broadcast_to`

`In [1]: x = np.array([1,2,3])`

In [2]: y = np.broadcast_to(x, (2,1,3))

In [3]: y.shape

Out[3]: (2, 1, 3)

Elsewhere in the code, I use third-party functions that can operate in a vectorized way on Numpy arrays but that are not ufuncs. These functions don't understand broadcasting, which means that calling such a function on arrays like

`y`

`vectorize`

`for`

Ideally, what I'd like to be able to do is to have a function, which we can call e.g.

`unbroadcast`

`In [4]: z = unbroadcast(y)`

In [5]: z.shape

Out[5]: (1, 1, 3)

I can then run the third-party functions on

`z`

`y.shape`

Is there a way to implement

`unbroadcast`

Answer Source

This is probably equivalent to your own solution, only a bit more built-in. It uses `as_strided`

in `numpy.lib.stride_tricks`

:

```
import numpy as np
from numpy.lib.stride_tricks import as_strided
x = np.arange(16).reshape(2,1,8,1) # shape (2,1,8,1)
y = np.broadcast_to(x,(2,3,8,5)) # shape (2,3,8,5) broadcast
def unbroadcast(arr):
#determine unbroadcast shape
newshape = np.where(np.array(arr.strides) == 0,1,arr.shape) # [2,1,8,1], thanks to @Divakar
return as_strided(arr,shape=newshape) # strides are automatically set here
z = unbroadcast(x)
np.all(z==x) # is True
```

Note that in my original answer I didn't define a function, and the resulting `z`

array had `(64,0,8,0)`

as `strides`

, whereas the input has `(64,64,8,8)`

. In the current version the returned `z`

array has identical strides to `x`

, I guess passing and returning the array forces a creation of a copy. Anyway, we could always set the strides manually in `as_strided`

to get identical arrays under all circumstances, but this doesn't seem necessary in the above setup.