mvsrs - 1 year ago
Python Question

Fit data to all possible distributions and return the best fit

I have some sample data and I want to find the best-fitting distribution. I have found a couple of links suggesting that I can import the distributions from

, but I do not know the type of the data beforehand. I want something similar to
which tries to fit data to around 20 distributions and returns the best fit.

Link for

Any help is highly appreciated. Thanks.


You can just create a list of all available distributions in scipy. An example with two distributions and random data:

import numpy as np
import scipy.stats as st

data = np.random.random(10000)
distributions = [st.laplace, st.norm]
mles = []

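for distribution in distributions:
    # Estimate the distribution's parameters from the data (maximum likelihood)
    pars = distribution.fit(data)
    # Negative log-likelihood of the data under the fitted parameters; lower is better
    mle = distribution.nnlf(pars, data)
    mles.append(mle)

results = [(distribution.name, mle) for distribution, mle in zip(distributions, mles)]
# Sort ascending by negative log-likelihood and take the first entry
best_fit = sorted(zip(distributions, mles), key=lambda d: d[1])[0]
print('Best fit reached using {}, MLE value: {}'.format(best_fit[0].name, best_fit[1]))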
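
To try a longer list of candidates, the same loop can be wrapped in a small function. The following is only a sketch: the function name `rank_distributions`, the four candidates, and the seeded normal sample are assumptions chosen for illustration, not part of the original answer. It skips any candidate whose fit raises an error or yields a non-finite value, and returns the rest sorted by negative log-likelihood.

```python
import warnings

import numpy as np
import scipy.stats as st


def rank_distributions(data, candidates):
    """Fit each candidate and return (name, nll) pairs, best fit first.

    Hypothetical helper: the name and return shape are illustrative choices.
    """
    ranked = []
    for dist in candidates:
        try:
            with warnings.catch_warnings():
                # Some fits emit RuntimeWarnings while optimizing; silence them here
                warnings.simplefilter("ignore")
                params = dist.fit(data)
                nll = dist.nnlf(params, data)
        except Exception:
            # Skip candidates that cannot be fitted to this data
            continue
        if np.isfinite(nll):
            ranked.append((dist.name, nll))
    # Lower negative log-likelihood means a better fit
    return sorted(ranked, key=lambda pair: pair[1])


# Illustrative data: a seeded normal sample (an assumption, not the asker's data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=5000)
candidates = [st.norm, st.laplace, st.expon, st.uniform]

for name, nll in rank_distributions(sample, candidates):
    print("{}: {:.1f}".format(name, nll))
```

Note that ranking by raw negative log-likelihood tends to favor distributions with more free parameters, so if the candidates differ in parameter count it may be worth comparing a penalized score instead.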