bcf - 1 year ago 207

Python Question

Consider a sum of

`n`

`S = sum (Z^2(mu, sig^2))`

`S / sig^2`

`n`

`n*mu^2`

However, compare generating

`N`

`S`

`N`

`scipy.ncx2`

`import numpy as np`

from scipy.stats import ncx2, chi2

import matplotlib.pyplot as plt

n = 1000 # number of normals in sum

N_MC = 100000 # number of trials

mu = 0.05

sig = 0.3

### Generate sums of squared normals ###

Z = np.random.normal(loc=mu, scale=sig, size=(N_MC, n))

S = np.sum(Z**2, axis=1)

### Generate non-central chi2 RVs directly ###

dof = n

non_centrality = n*mu**2

NCX2 = sig**2 * ncx2.rvs(dof, non_centrality, size=N_MC)

# NCX2 = sig**2 * chi2.rvs(dof, size=N_MC) # for mu = 0.0

### Plot histos ###

fig, ax = plt.subplots()

ax.hist(S, bins=50, label='S')

ax.hist(NCX2, bins=50, label='NCX2', alpha=0.7)

ax.legend()

plt.show()

This results in the histograms

I believe the mathematics is correct; could the discrepancy be a bug in the

`ncx2`

`mu = 0`

`scipy.chi2`

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

The problem is in the second sentence of the question: *" S / sig^2 has a noncentral chi-squared distribution with degrees of freedom = n and non-centrality parameter = n*mu^2."* That non-centrality parameter is not correct. It should be

`n*(mu/sig)^2`

.The standard definition of the noncentral chi-squared distribution is that it is the sum of the squares of normal variates that have mean mu and *standard deviation 1*. You are computing `S`

using normal variates with standard deviation `sig`

. Let's write that distribution as `N(mu, sig**2)`

. By using the location-scale properties of the normal distribution, we have

```
N(mu, sig**2) = mu + sig*N(0, 1) = sig*(mu/sig + N(0,1)) = sig*N(mu/sig, 1)
```

So summing the squares of variates from `N(mu, sig**2)`

is equivalent to summing the squares of `sig*N(mu/sig, 1)`

. That gives `sig**2`

times a noncentral chi-squared variate with noncentrality `mu/sig`

.

If you change the line where `non_centrality`

is computed to

```
non_centrality = n*(mu/sig)**2
```

the histograms line up as you expect.

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**