Mike Vandenberg - 1 month ago 11
Python Question

# Broadcasting in Python with permutations

I understand that

`transpose`
on an
`ndarray`
is intended to be the equivalent of matlab's
`permute`
function however I have a specific usecase that doesn't work simply. In matlab I have the following:

``````C = @bsxfun(@times, permute(A,[4,2,5,1,3]), permute(B, [1,6,2,7,3,4,5])
``````

where A is a 3D tensor of shape NxNxM and B is a 5D tensor of shape NxNxMxPxP. The above function is meant to vectorize looped kronecker products. I'm assuming that Matlab is adding 2 singleton dimensions for both A and B which is why it's able to rearrange them. I'm looking to port this code over to Python but I don't think it has the capability of adding these extra dimensions.. I found this which successfully adds the extra dimensions however the broadcasting is not working the same matlab's
`bsxfun`
. I have attempted the obvious translation (yes I am using numpy for these
`ndarray`
's and functions):

``````A = A[...,None,None]
B = B[...,None,None]
C = transpose(A,[3,1,4,0,2])*transpose(B,[0,5,1,6,2,3,4])
``````

and I get the following error:

``````return transpose(axes)
ValueError: axes don't match array
``````

My first guess is to do a
`reshape`
on A and B to add in those singleton dimensions?

I now get the following error:

``````mults = transpose(rho_in,[3,1,4,0,2])*transpose(proj,[0,5,1,6,2,3,4])
ValueError: operands could not be broadcast together with shapes (1,9,1,9,8) (9,1,9,1,8,40,40)
``````

The huge difference between MATLAB and numpy is that the former uses column-major format for its arrays, while the latter row-major. The corollary is that implicit singleton dimensions are handled differently.

Specifically, MATLAB explicitly ignores trailing singleton dimensions: `rand(3,3,1,1,1,1,1)` is actually a `3x3` matrix. Along these lines, you can use `bsxfun` to operate on two arrays if their leading dimensions match: `NxNxM` is implicitly `NxNxMx1x1` which is compatible with `NxNxMxPxP`.

Numpy, on the other hand, allows implicit singletons up front. You need to `permute` your arrays in a way that their trailing dimensions match up, for instance shape `(40,40,9,1,9,1,8)` with shape `(1,9,1,9,8)`, and the result should be of shape `(40,40,9,9,9,9,8)`.

Dummy example:

``````>>> import numpy as np
>>> (np.random.rand(40,40,9,1,9,1,8)+np.random.rand(1,9,1,9,8)).shape
(40, 40, 9, 9, 9, 9, 8)
``````

Note that what you're trying to do can probably be done using `numpy.einsum`. I suggest taking a closer look at that. An example of what I mean: from your question I gathered that you want to perform this: take elements `A[1:N,1:N,1:M]` and `B[1:N,1:N,1:M,1:P,1:P]` and construct a new array `C[1:N,1:N,1:N,1:N,1:M,1:P,1:P]` such that

``````C[i1,i2,i3,i4,i5,i6,i7] = A[i2,i4,i5]*B[i1,i3,i5,i6,i7]
``````

If this is correct, you can indeed use `numpy.einsum()`:

``````>>> a = np.random.rand(3,3,2)
>>> b = np.random.rand(3,3,2,4,4)
>>> np.einsum('ijk,lmkno->limjkno',a,b).shape
(3, 3, 3, 3, 2, 4, 4)
``````

Two things should be noted, though. Firstly, the above operation will be very memory-intense, which should be expected for cases of vectorization (where you usually win CPU time at the expense of memory need). Secondly, you should seriously consider rearranging the data model when you port your code. The reason that broadcasting works differently in the two languages is intricately connected to the column-major/row-major difference. This also implies that in MATLAB you should work with leading indices first, since `A(:,i2,i3)` corresponds to a contiguous block of memory, while `A(i1,i2,:)` does not. Conversely, in numpy `A[i1,i2,:]` is contiguous, while `A[:,i2,i3]` is not.

These considerations suggest that you should set up the logistics of your data such that vectorized operations preferably work with leading indices in MATLAB and trailing indices in numpy. You could still use `numpy.einsum` to perform the operation itself, however your dimensions should be in a different (possibly reverse) order compared to MATLAB, at least if we assume that both versions of the code use an optimal setup.

Source (Stackoverflow)