Demetri P Demetri P - 4 months ago 22
Python Question

Is there a scipy/numpy alternative to R's nrd0?

From the R Documentation...


bw.nrd0 implements a rule-of-thumb for choosing the bandwidth of a
Gaussian kernel density estimator. It defaults to 0.9 times the
minimum of the standard deviation and the interquartile range divided
by 1.34 times the sample size to the negative one-fifth power (=
Silverman's ‘rule of thumb’, Silverman (1986, page 48, eqn (3.31))
unless the quartiles coincide when a positive result will be
guaranteed.


On an array from 1 to 400 (equivalent to
np.arange(1,401)
), nrd0 will return 31.39367. When I try to implement something similar in python...

def nrd0_python(x):

X = min(np.std(x), np.percentile(x,25))

top = 0.9*X
bottom = 1.34*len(x)**(-0.2)

return top/bottom


I get upwards of 200 (to be exact, 224.28217762858455)

Are there any known python functions for silverman's rule of thumb?

Answer

The precedence of operators in an English sentence is ambiguous.

Here's an implementation that returns the value you expect:

In [201]: x = np.arange(1, 401)

In [202]: 0.9*min(np.std(x, ddof=1), (np.percentile(x, 75) - np.percentile(x, 25))/1.349)*len(x)**(-0.2)
Out[202]: 31.393668650034652

I don't have the Silverman reference. I found that in http://darp.lse.ac.uk/papersdb/Cowell-Flachaire_Handbook.pdf (p. 21).

Comments