Arpan Das - 4 months ago
Python Question

How to compute the median and the 68% confidence interval around the median of a non-Gaussian distribution in Python?

I have a data set stored as a NumPy array, say a=[a1,a2,.....], together with the weights of the data, w=[w1,w2,w3...]. I have computed the histogram with numpy.histogram, which gives me the hist array. Now I want to compute the median of this probability distribution and also the 68% interval around the median. Note that my data set is not Gaussian.

Can anyone help? I am using Python.


Here is a solution using scipy.stats.rv_discrete:

import numpy as np
import scipy.stats as st

# example data set
a = np.arange(20)
w = a + 1

# create custom discrete random variable from data set
rv = st.rv_discrete(values=(a, w/float(w.sum())))

# rv_discrete provides median() and interval(); interval(0.68) returns the
# equal-tailed range around the median (16th to 84th percentile)
print("median:", rv.median())
print("68% CI:", rv.interval(0.68))

The output reflects the uneven weights in the example data set (larger values carry more weight, so the median lies above the midpoint of the range):

median: 13.0
68% CI: (7.0, 18.0)
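
If you would rather not build an rv_discrete object (for example when the data are floats or contain repeated values), the same quantities can be read off the weighted cumulative distribution directly with NumPy. The following is only a minimal sketch: weighted_quantile is a hypothetical helper name, not a NumPy or SciPy function, and it assumes non-negative weights and the "smallest value whose cumulative weight reaches q" convention, with no interpolation between data points.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Smallest data value whose cumulative normalized weight reaches q.

    Hypothetical helper for illustration; assumes non-negative weights.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # sort the data and carry the weights along
    order = np.argsort(values)
    values, weights = values[order], weights[order]

    # weighted empirical CDF evaluated at each sorted data point
    cdf = np.cumsum(weights) / weights.sum()

    # first index where cdf >= q (clipped to guard against round-off at q near 1)
    idx = np.minimum(np.searchsorted(cdf, q, side="left"), len(values) - 1)
    return values[idx]

# same example data set as above
a = np.arange(20)
w = a + 1

median = weighted_quantile(a, w, 0.5)
lo, hi = weighted_quantile(a, w, [0.16, 0.84])

print("median:", median)
print("68% interval:", (lo, hi))
```

For the example data this gives the same values as the rv_discrete version: a median of 13.0 and a 68% interval of 7.0 to 18.0. The 0.16 and 0.84 levels are what make the interval equal-tailed around the median; no Gaussian assumption is involved.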