steeeve steeeve - 7 days ago 6
Python Question

Pythonic method to evaluate averages over a huge data set

I have a vector

X
that is composed of
N = 10e6
values. I would like to compute the mean for the increasing pairs. For example:

for i in range(0,N-1):
Ex[i] = X[0:i+1].mean()


This is a terribly inefficient way of doing this. What would be a more intelligent algorithm for Python? Note
Ex
and
X
are both numpy arrays of float values.

Answer

A numpy-centric solution could be as follows:

X = np.random.rand(10**6)
EX = np.cumsum(X) / np.arange(1, X.shape[0]+1)
Comments