Demetri P - 6 months ago 33

Python Question

From the R Documentation...

bw.nrd0 implements a rule-of-thumb for choosing the bandwidth of a

Gaussian kernel density estimator. It defaults to 0.9 times the

minimum of the standard deviation and the interquartile range divided

by 1.34 times the sample size to the negative one-fifth power (=

Silverman's ‘rule of thumb’, Silverman (1986, page 48, eqn (3.31))

unless the quartiles coincide when a positive result will be

guaranteed.

On an array from 1 to 400 (equivalent to

`np.arange(1,401)`

`def nrd0_python(x):`

X = min(np.std(x), np.percentile(x,25))

top = 0.9*X

bottom = 1.34*len(x)**(-0.2)

return top/bottom

I get upwards of 200 (to be exact, 224.28217762858455)

Are there any known python functions for silverman's rule of thumb?

Answer

The precedence of operators in an English sentence is ambiguous.

Here's an implementation that returns the value you expect:

```
In [201]: x = np.arange(1, 401)
In [202]: 0.9*min(np.std(x, ddof=1), (np.percentile(x, 75) - np.percentile(x, 25))/1.349)*len(x)**(-0.2)
Out[202]: 31.393668650034652
```

I don't have the Silverman reference. I found that in http://darp.lse.ac.uk/papersdb/Cowell-Flachaire_Handbook.pdf (p. 21).