mvsrs mvsrs - 1 year ago 183
Python Question

Fit data to all possible distributions and return the best fit

I have some sample data and I want to find the best-fitting distribution. I have found a couple of links suggesting that I can import the distributions from SciPy, but I do not know the type of the data beforehand. I want something that tries to fit the data to around 20 distributions and returns the best fit.

Any help is highly appreciated. Thanks.

Answer Source

You can just create a list of all available distributions in scipy. An example with two distributions and random data:

import numpy as np
import scipy.stats as st

data = np.random.random(10000)
distributions = [st.laplace, st.norm]
mles = []

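for distribution in distributions:
    # Fit the distribution's parameters to the data by maximum likelihood
    pars = distribution.fit(data)
    # Negative log-likelihood of the data under the fitted parameters (lower is better)
    mle = distribution.nnlf(pars, data)
    mles.append(mle)

results = [(distribution.name, mle) for distribution, mle in zip(distributions, mles)]
# The best fit is the distribution with the smallest negative log-likelihood
best_fit = sorted(zip(distributions, mles), key=lambda d: d[1])[0]
print('Best fit reached using {}, negative log-likelihood: {}'.format(best_fit[0].name, best_fit[1]))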
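
The same idea extends to a longer candidate list. The following is only a rough sketch: the `CANDIDATES` shortlist and the `rank_distributions` helper are hypothetical names chosen for illustration, and the sample data is synthetic. It looks up each distribution by name in `scipy.stats`, skips any whose fit fails, and ranks the rest by negative log-likelihood.

```python
import warnings

import numpy as np
import scipy.stats as st

# Hypothetical shortlist; extend it with other scipy.stats continuous distributions.
CANDIDATES = ["norm", "laplace", "expon", "gamma", "lognorm", "uniform"]


def rank_distributions(data, names=CANDIDATES):
    """Fit each named distribution; return (name, nll, params) sorted by nll."""
    results = []
    for name in names:
        dist = getattr(st, name)
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")  # fitting can emit runtime warnings
                params = dist.fit(data)
                nll = dist.nnlf(params, data)  # negative log-likelihood
        except Exception:
            continue  # skip distributions whose fit fails on this data
        if np.isfinite(nll):
            results.append((name, nll, params))
    # Smallest negative log-likelihood first
    return sorted(results, key=lambda r: r[1])


# Synthetic, normally distributed sample (an assumption for the demo)
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=0.5, size=2000)

ranking = rank_distributions(sample)
for name, nll, _ in ranking:
    print("{:10s} {:.2f}".format(name, nll))
```

Note that the raw likelihood tends to favor distributions with more free parameters, so a penalized criterion such as AIC or a goodness-of-fit test may be a fairer way to compare candidates.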