minks - 3 months ago 24

Python Question

If I have a list like this:

`results=[-14.82381293 -0.29423447 -13.56067979 -1.6288903 -0.31632439 0.53459687 -1.34069996 -1.61042692 -4.03220519 -0.24332097]`

I want to calculate the variance of this list in Python.

Variance = The average of the squared differences from the mean.

How can I go about this? I'm confused about how to access the elements of the list in order to compute the squared differences.

Answer

Just use numpy's built-in function var (and add commas to your list):

```
import numpy as np
results = [-14.82381293, -0.29423447, -13.56067979, -1.6288903, -0.31632439,
           0.53459687, -1.34069996, -1.61042692, -4.03220519, -0.24332097]
# print() with parentheses works in both Python 2 and Python 3
print(np.var(results))
```

This gives you `28.822364260579157`

If, for whatever reason, you cannot use numpy or you don't want to use a built-in function for it, you can also calculate it by hand, for example with a list comprehension:

```
# calculate mean
m = sum(results) / len(results)
# calculate variance using a list comprehension
varRes = sum([(xi - m)**2 for xi in results]) / len(results)
```

which gives you the identical result.
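For reference, the by-hand calculation can be wrapped into a small self-contained function. This is only a sketch: the name `population_variance` and the empty-list check are my own additions for illustration, not part of the original answer.

```python
def population_variance(values):
    """Average of the squared differences from the mean (divisor n)."""
    n = len(values)
    if n == 0:
        # Assumption: treat an empty input as an error rather than returning 0
        raise ValueError("variance requires at least one value")
    mean = sum(values) / n
    # Sum the squared deviations from the mean, then divide by n
    return sum((x - mean) ** 2 for x in values) / n


results = [-14.82381293, -0.29423447, -13.56067979, -1.6288903, -0.31632439,
           0.53459687, -1.34069996, -1.61042692, -4.03220519, -0.24332097]

print(population_variance(results))  # approximately 28.8224
```

The last few printed digits may differ slightly from numpy's output because of floating-point rounding.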

**EDIT**

@Serge Ballesta explained very well the difference between the variance computed with divisor `n` and with divisor `n-1`. In numpy you can easily set this using the option `ddof`; its default is 0, so for the `n-1` case you can simply do:

```
np.var(results, ddof=1)
```

The "by hand" solution would be:

```
sum([(xi - m)**2 for xi in results]) / (len(results) - 1)
```

Both approaches give you `32.024849178421285`.
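If numpy is not available but you are on Python 3.4 or newer, the standard library's `statistics` module should cover both cases as well. A minimal sketch, assuming the same `results` list as above: `statistics.pvariance` divides by `n` and `statistics.variance` divides by `n-1`.

```python
import statistics

results = [-14.82381293, -0.29423447, -13.56067979, -1.6288903, -0.31632439,
           0.53459687, -1.34069996, -1.61042692, -4.03220519, -0.24332097]

# Population variance: divisor n (corresponds to ddof=0 in numpy)
pop_var = statistics.pvariance(results)

# Sample variance: divisor n-1 (corresponds to ddof=1 in numpy)
sample_var = statistics.variance(results)

print(pop_var, sample_var)  # approximately 28.8224 and 32.0248
```

The values should agree with the numpy and by-hand results up to floating-point rounding.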