I have a matrix
M1
(N*2)
M2
(2*N)
(N)
i
i
M1
i
M2
(N*N)
Approach #1
You can use np.einsum

np.einsum('ij,ji>i',M1,M2)
Explanation :
The original loopy solution would look something like this 
def original_app(M1,M2):
N = M1.shape[0]
out = np.zeros(N)
for i in range(N):
out[i] = M1[i].dot(M2[:,i])
return out
Thus, for each iteration, we have :
out[i] = M1[i].dot(M2[:,i])
Looking at the iterator, we need to align the first axis of M1
with the second axis of M2
. Again, since we are performing matrixmultiplication
and that by its very definition is aligning the second axis of M1
with the first axis of M2
and also sumreducing these elements at each iteration.
When porting over to einsum
, keep the axes to be aligned between the two inputs to have the same string when specifying the string notation to it. So, the inputs would be 'ij,ji
for M1
and M2
respectively. The output after losing the second string from M1
, which is same as first string from M2
in that sumreduction, should be left as i
. Thus, the complete string notation would be : 'ij,ji>i'
and the final solution as : np.einsum('ij,ji>i',M1,M2)
.
Approach #2
The number of cols in M1
or number of rows in M2
is 2
. So, alternatively, we can just slice, perform the elementwise multiplication and sum up those, like so 
M1[:,0]*M2[0] + M1[:,1]*M2[1]
Runtime test
In [431]: # Setup inputs
...: N = 1000
...: M1 = np.random.rand(N,2)
...: M2 = np.random.rand(2,N)
...:
In [432]: np.allclose(original_app(M1,M2),np.einsum('ij,ji>i',M1,M2))
Out[432]: True
In [433]: np.allclose(original_app(M1,M2),M1[:,0]*M2[0] + M1[:,1]*M2[1])
Out[433]: True
In [434]: %timeit original_app(M1,M2)
100 loops, best of 3: 2.09 ms per loop
In [435]: %timeit np.einsum('ij,ji>i',M1,M2)
100000 loops, best of 3: 13 µs per loop
In [436]: %timeit M1[:,0]*M2[0] + M1[:,1]*M2[1]
100000 loops, best of 3: 14.2 µs per loop
Massive speedup there!