For xray diffraction one needs to find the solutions to the so called LaueEquation
G_hkl  k_in + k_in*(sin(theta) cos(phi) , sin(theta) sin(phi) , cos(theta))=0
where G_hkl is a given 3 dimensional vector, k_in can be chosen as (0,0,1) and theta and phi are free parameters to fulfill the equation. In a typical experiment G_hkl is then rotated around the xaxis and for every step in the rotation one needs to find the solution to the equation. The equation cannot have more than one solution at a given rotation.
I have written this python script to find these solution but for my application it is not fast enough.
import numpy as np
import time
# Just initialization of variables. This is fast enough
res_list = []
N_rot = 100
Ghkl = np.array([0,0.7,0.7])
Ghkl_norm = np.linalg.norm(Ghkl)
kin = np.array([0,0,1])
kin_norm = np.linalg.norm(kin)
alpha_step = 2*np.pi/(N_rot1)
rot_mat = np.array([[1,0,0],[0,np.cos(alpha_step),np.sin(alpha_step)],[0,np.sin(alpha_step),np.cos(alpha_step)]])
# You can deduce theta from the norm of the vector equation
theta = np.arccos(1(Ghkl_norm**2/(2*kin_norm**2)))
sint = np.sin(theta)
cost = np.cos(theta)
# This leaves only phi as paramter to find
# I find phi by introducing a finite test vector
# and testing whether the norm of the vector equation is close
# to zero for any of those test phis
phi_test = np.linspace(0,2*np.pi,200)
kout = kin_norm * np.array([sint * np.cos(phi_test), sint * np.sin(phi_test), cost + 0*phi_test]).T
##############
start_time = time.time()
for j in range(100): # just to make it longer to measure the time
res_list = []
for i in range(N_rot): # This loop is too slow
# Here the norm of the vector equation is calculated for all phi_test
norm_vec = np.linalg.norm(Ghkl[np.newaxis, :]  kin[np.newaxis, :] + kout, axis=1)
if (norm_vec < 0.01 * kin_norm).any(): # check whether any fulfills the criterion
minarg = np.argmin(norm_vec)
res_list.append([theta, phi_test[minarg]])
Ghkl = np.dot(rot_mat,Ghkl)
print('Time was {0:1.2f} s'.format( (time.time()start_time)))
# On my machine it takes 0.3s and res_list should be
# [[1.0356115365192968, 1.578689775673263]]
There's a dependency as Ghkl
is being updated at each iteration and reused at the next. The corresponding closed form might be difficult to trace out. So, I would focus on improving the performance on the rest of the code inside that innermost loop.
Now, there we are calculating norm_vec
, which I think could be sped up with two methods as listed next.
Appproach #1 Using Scipy's cdist

from scipy.spatial.distance import cdist
norm_vec = cdist(kout,(kinGhkl)[None]) # Rest of code stays the same
Appproach #2 Using np.einsum

sums = (Ghklkin)+kout
norm_vec_sq = np.einsum('ij,ij>i',sums,sums)
if (norm_vec_sq < (0.01 * kin_norm)**2 ).any():
minarg = np.argmin(norm_vec_sq) # Rest of code stays the same
Runtime test 
Using the inputs listed in the question, we have these results :
In [91]: %timeit np.linalg.norm(Ghkl kin + kout, axis=1)
10000 loops, best of 3: 31.1 µs per loop
In [92]: sums = (Ghklkin)+kout
...: norm_vec_sq = np.einsum('ij,ij>i',sums,sums)
...:
In [93]: %timeit (Ghklkin)+kout # Approach2  step1
100000 loops, best of 3: 7.09 µs per loop
In [94]: %timeit np.einsum('ij,ij>i',sums,sums) # Approach2  step2
100000 loops, best of 3: 3.82 µs per loop
In [95]: %timeit cdist(kout,(kinGhkl)[None]) # Approach1
10000 loops, best of 3: 44.1 µs per loop
So, approach #1 isn't showing any improvement for those input sizes, but approach #2 is getting around 3x
speedup there on calculating norm_vec
.
Short note on why np.einsum
is beneficial here : Well, einsum
is great for elementwise multiplication and sumreduction. We are exploiting that very nature of problem here. np.linalg.norm
gives us summations along the second axis on the squared version of the input. So, the einsum
counterpart would be to just feed the same input twice as the two inputs to it, thus taking care of the squaring and then lose the second axis with its sumreduction that is expressed with its string notation. It does both these in one go and that's probably why it's so fast here.