Hi I'm relatively new here and trying to do some calculations with numpy. I'm experiencing a long elapse time from one particular calculation and can't work out any faster way to achieve the same thing.
Basically its part of a ray triangle intersection algorithm and I need to calculate all the vector cros products from two matrices of different sizes.
The code I was using was :
allhvals1 = numpy.cross( dirvectors[:,None,:], trivectors2[None,:,:] )
n* vectors (xyz)
If you look at the source code of
np.cross, it basically moves the
xyz dimension to the front of the shape tuple for all arrays, and then has the calculation of each of the components spelled out like this:
x = a*b - a*b y = a*b - a*b z = a*b - a*b
In your case, each of those products requires allocating huge arrays, so the overall behavior is not very efficient.
Lets set up some test data:
u = np.random.rand(1000, 3) v = np.random.rand(2000, 3) In : %timeit s1 = np.cross(u[:, None, :], v[None, :, :]) 1 loops, best of 3: 591 ms per loop
We can try to compute it using Levi-Civita symbols and
np.einsum as follows:
eijk = np.zeros((3, 3, 3)) eijk[0, 1, 2] = eijk[1, 2, 0] = eijk[2, 0, 1] = 1 eijk[0, 2, 1] = eijk[2, 1, 0] = eijk[1, 0, 2] = -1 In : %timeit s2 = np.einsum('ijk,uj,vk->uvi', eijk, u, v) 1 loops, best of 3: 706 ms per loop In : np.allclose(s1, s2) Out: True
So while it works, it has worse performance. The thing is that
np.einsum has trouble when there are more than two operands, but has optimized pathways for two or less. So we can try to rewrite it in two steps, to see if it helps:
In : %timeit s3 = np.einsum('iuk,vk->uvi', np.einsum('ijk,uj->iuk', eijk, u), v) 10 loops, best of 3: 63.4 ms per loop In : np.allclose(s1, s3) Out: True
Bingo! Close to an order of magnitude improvement...
Some performance figures with
einsum is about twice as fast as
cross for the largest