Ginger Ginger - 6 months ago 28
Python Question

Numpy mean AND variance from single function?

Using Numpy/Python, is it possible to return the mean AND variance from a single function call?

I know that I can compute them separately, but the mean is required to calculate the sample standard deviation. So if I use separate functions to get the mean and variance, I am adding unnecessary overhead.

I have tried looking at the numpy docs here (http://docs.scipy.org/doc/numpy/reference/routines.statistics.html), but with no success.

Answer

You can't pass a known mean to np.std or np.var; for that you'll have to wait for the new standard library statistics module. In the meantime, you can save a little time by computing the variance directly from its definition, the mean of the squared deviations from the mean:

In [329]: a = np.random.rand(1000)

In [330]: %%timeit
   .....: a.mean()
   .....: a.var()
   .....: 
10000 loops, best of 3: 80.6 µs per loop

In [331]: %%timeit
   .....: m = a.mean()
   .....: np.mean((a-m)**2)
   .....: 
10000 loops, best of 3: 60.9 µs per loop

In [332]: m = a.mean()

In [333]: a.var()
Out[333]: 0.078365856465916137

In [334]: np.mean((a-m)**2)
Out[334]: 0.078365856465916137
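
For comparison, here is a minimal sketch of the standard library statistics module mentioned earlier, which does accept a precomputed mean (Python 3.4 and later). The sample values in data are made up for illustration. The module is pure Python, so on large arrays it will probably be much slower than the NumPy versions timed here; it is shown only to illustrate the mu and xbar arguments.

```python
import statistics

# Made-up sample values, purely for illustration.
data = [0.5, 1.5, 2.5, 3.5]

# Compute the mean once...
m = statistics.mean(data)

# ...and hand it to the variance functions so they do not recompute it.
pop_var = statistics.pvariance(data, mu=m)      # population variance (divides by N)
sample_var = statistics.variance(data, xbar=m)  # sample variance (divides by N - 1)
sample_std = statistics.stdev(data, xbar=m)     # sample standard deviation

print(m, pop_var, sample_var, sample_std)
```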

If you really are trying to speed things up, try np.dot to do the squaring and summing (since that's what a dot-product is):

In [335]: np.dot(a-m,a-m)/a.size
Out[335]: 0.078365856465916137

In [336]: %%timeit
   .....: m = a.mean()
   .....: c = a-m
   .....: np.dot(c,c)/a.size
   .....: 
10000 loops, best of 3: 38.2 µs per loop
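
To get both values from a single call, as the question asks, the np.dot approach can be wrapped in a small helper. This is only a sketch: the name mean_and_var and its ddof parameter are my own and not part of NumPy. Passing ddof=1 is assumed to give the sample variance the question needs for the sample standard deviation, mirroring the ddof argument of np.var.

```python
import numpy as np

def mean_and_var(a, ddof=0):
    """Return (mean, variance) over all elements of a, computing the mean once.

    Hypothetical helper, not a NumPy function.
    ddof=0 gives the population variance (the np.var default);
    ddof=1 gives the sample variance.
    """
    # Flatten to 1-D so np.dot computes a plain sum of squares.
    a = np.asarray(a, dtype=float).ravel()
    m = a.mean()
    c = a - m
    # The dot product of the centered data with itself is the sum of squared deviations.
    return m, np.dot(c, c) / (a.size - ddof)

a = np.random.rand(1000)

# Population variance, should agree with a.var() up to rounding.
m, v = mean_and_var(a)
print(np.isclose(v, a.var()))

# Sample variance and sample standard deviation.
m, s2 = mean_and_var(a, ddof=1)
print(np.isclose(np.sqrt(s2), a.std(ddof=1)))
```

The helper flattens its input, so it reduces over every element; it does not support the axis argument that np.mean and np.var accept.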