garak garak - 14 days ago 5
Python Question

Save classifier to disk in scikit-learn

How do I save a trained Naive Bayes classifier to disk and use it to predict data?

I have the following sample program from the scikit-learn website:

from sklearn import datasets
iris = datasets.load_iris()
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print "Number of mislabeled points : %d" % (iris.target != y_pred).sum()

mwv mwv
Answer

Classifiers are just objects that can be pickled and dumped like any other. To continue your example:

import cPickle
# save the classifier
with open('my_dumped_classifier.pkl', 'wb') as fid:
    cPickle.dump(gnb, fid)    

# load it again
with open('my_dumped_classifier.pkl', 'rb') as fid:
    gnb_loaded = cPickle.load(fid)