Peterstone Peterstone - 6 months ago 24
Python Question

Saving an Object (Data persistence in Python)

I've created an object like this:

company1.name = 'banana'
company1.value = 40


I would like to save this object. How can I do that?

Answer

You could use the pickle module in the standard library.
Here's an elementary application of it to your example:

import pickle

class Company(object):
    def __init__(self, name, value):
        self.name = name
        self.value = value

with open('company_data.pkl', 'wb') as output:
    company1 = Company('banana', 40)
    pickle.dump(company1, output, pickle.HIGHEST_PROTOCOL)

    company2 = Company('spam', 42)
    pickle.dump(company2, output, pickle.HIGHEST_PROTOCOL)

del company1
del company2

with open('company_data.pkl', 'rb') as infile:  # avoid shadowing the built-in input()
    company1 = pickle.load(infile)
    print(company1.name)  # -> banana
    print(company1.value)  # -> 40

    company2 = pickle.load(infile)
    print(company2.name)  # -> spam
    print(company2.value)  # -> 42
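
The example above reads back exactly two objects because it knows two were written. When the number of pickled objects in a file isn't known in advance, one possible approach is to keep calling pickle.load until it raises EOFError. The helper below is only a sketch; the name load_all is made up for illustration and is not part of the pickle module:

```python
import pickle

def load_all(filename):
    """Yield every object pickled into `filename`, in the order written.

    `load_all` is a hypothetical helper name, not a pickle API.
    """
    with open(filename, 'rb') as infile:
        while True:
            try:
                yield pickle.load(infile)
            except EOFError:
                # pickle.load raises EOFError once no more data is left.
                break
```

With the file written earlier, something like `for company in load_all('company_data.pkl'): print(company.name)` should visit each saved object in turn.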

You could also use a simple utility like the following which opens a file and writes a single object to it:

def save_object(obj, filename):
    with open(filename, 'wb') as output:
        pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

# sample usage
save_object(company1, 'company1.pkl')
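
A matching reader is the natural counterpart to save_object. The following is a minimal sketch; load_object is a hypothetical name chosen here to mirror save_object, and it assumes the file holds a single pickled object from a source you trust (unpickling untrusted data can execute arbitrary code):

```python
import pickle

def load_object(filename):
    """Return the single object stored in `filename`.

    `load_object` is a hypothetical counterpart to save_object.
    Only use it on files you trust.
    """
    with open(filename, 'rb') as infile:
        return pickle.load(infile)
```

Under those assumptions, `company1 = load_object('company1.pkl')` would restore the object saved in the sample usage, provided the Company class is importable when the file is loaded.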

Update:

Since this is such a popular answer, I'd like to touch on a few slightly advanced usage topics.

First, in Python 2 it's almost always preferable to use the cPickle module rather than pickle, because the former is written in C and is much faster. There are some subtle differences between them, but in most situations they're equivalent and the C version performs far better. Switching to it couldn't be easier: just change the import statement to this:

import cPickle as pickle

(Note: In Python 3, cPickle was renamed _pickle, but changing the import is no longer necessary because the pickle module now uses the C implementation automatically when it is available; see the question "What difference between pickle and _pickle in python 3?").
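
If the same script has to run under both Python 2 and Python 3, one common pattern is to try the cPickle import and fall back to pickle. This is a sketch of that idea rather than something the code above requires; the sample dictionary is just illustrative data:

```python
try:
    import cPickle as pickle  # Python 2: explicit C implementation
except ImportError:
    import pickle  # Python 3: cPickle no longer exists under that name

# Illustrative round trip through an in-memory byte string.
data = pickle.dumps({'name': 'banana', 'value': 40}, -1)
restored = pickle.loads(data)
```

Either way, the rest of the code keeps referring to the module simply as pickle.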

Secondly, instead of writing pickle.HIGHEST_PROTOCOL in every call (assuming that's what you want, and you usually do), you can just use the literal -1, because a negative protocol number selects the highest protocol available. So, instead of writing:

pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

You can just write:

pickle.dump(obj, output, -1)

The second form is quite a bit shorter.
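
A quick way to convince yourself that the two spellings are interchangeable is to pickle the same object both ways and compare the bytes. This is only an illustrative check, and the record dictionary is made-up sample data:

```python
import pickle

record = {'name': 'banana', 'value': 40}  # made-up sample data

shorthand = pickle.dumps(record, -1)
explicit = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)

# Both calls select the same protocol, so the byte strings should match.
same_bytes = shorthand == explicit
```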

An even better way, where you only have to specify the protocol once, is to create a Pickler object and use it to do multiple pickle operations:

pickler = pickle.Pickler(output, -1)
pickler.dump(obj1)
pickler.dump(obj2)
# etc...
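
For completeness, here is a self-contained sketch of the Pickler approach with its reading counterpart, pickle.Unpickler. It uses an in-memory io.BytesIO buffer in place of a real file, and the dictionaries are made-up sample data. Because a Pickler keeps a memo of objects it has already written, the sketch reads the stream back with a single Unpickler rather than with separate pickle.load calls:

```python
import io
import pickle

buffer = io.BytesIO()  # stands in for a file opened with 'wb'

pickler = pickle.Pickler(buffer, -1)  # protocol given once
pickler.dump({'name': 'banana', 'value': 40})
pickler.dump({'name': 'spam', 'value': 42})

buffer.seek(0)  # rewind before reading the stream back

unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()
```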