Tianyang Li - 5 months ago

Python: performance comparison of using `pickle` or `marshal` and using `re`

I am calculating some very large numbers in Python, and I'd like to store previously calculated results in Berkeley DB.

The problem is that Berkeley DB only stores strings, and each calculation result is a tuple of integers.

For example, if I get `(m, n)` as my result, one way is to store it as `"%d,%d" % (m, n)` and read it back using `re`. I can also store the tuple using `pickle` or `marshal`.
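To make the three options concrete, here is a minimal sketch of each round trip. It assumes Python 3, where `pickle.dumps` and `marshal.dumps` return `bytes`; the helper names `to_text` and `from_text` are hypothetical and not part of the question.

```python
import marshal
import pickle
import re

# Pattern for the "%d,%d" text format; allows negative integers.
_PAIR = re.compile(r"(-?\d+),(-?\d+)")


def to_text(pair):
    """Encode an (m, n) integer tuple as 'm,n' (hypothetical helper)."""
    m, n = pair
    return "%d,%d" % (m, n)


def from_text(text):
    """Decode 'm,n' back into an (m, n) integer tuple using re."""
    match = _PAIR.fullmatch(text)
    if match is None:
        raise ValueError("not an 'm,n' pair: %r" % (text,))
    return int(match.group(1)), int(match.group(2))


pair = (2**100, -(3**50))

# Each encoding should give back exactly the tuple that went in.
assert from_text(to_text(pair)) == pair
assert pickle.loads(pickle.dumps(pair)) == pair
assert marshal.loads(marshal.dumps(pair)) == pair
```

The text format stays human-readable in the database, while `pickle` and `marshal` produce opaque binary values.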

Which has the better performance?

Answer

For pure speed, `marshal` is the fastest of the serializers timed below.

Timings for 10,000 `dumps` calls on a small list (in seconds; lower is better):

>>> import timeit
>>> timeit.timeit("pickle.dumps([1,2,3])","import pickle",number=10000)
0.2939901351928711
>>> timeit.timeit("json.dumps([1,2,3])","import json",number=10000)
0.09756112098693848
>>> timeit.timeit("pickle.dumps([1,2,3])","import cPickle as pickle",number=10000)
0.031056880950927734
>>> timeit.timeit("marshal.dumps([1,2,3])","import marshal", number=10000)
0.00703883171081543
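The timings above cover only `dumps` on a small list and do not include the `"%d,%d"` plus `re` approach from the question. The following is a rough sketch of how the full round trip could be measured for an integer tuple. It assumes Python 3, where `cPickle` no longer exists as a separate module and `pickle` uses its C implementation when available. The function name `time_round_trips` and the sample values are illustrative assumptions, and no results are shown because they depend on the machine and Python version.

```python
import marshal
import pickle
import re
import timeit

_PAIR = re.compile(r"(-?\d+),(-?\d+)")


def text_round_trip(pair):
    """Store as 'm,n' text and parse it back with re."""
    text = "%d,%d" % pair
    match = _PAIR.fullmatch(text)
    return int(match.group(1)), int(match.group(2))


def pickle_round_trip(pair):
    return pickle.loads(pickle.dumps(pair))


def marshal_round_trip(pair):
    return marshal.loads(marshal.dumps(pair))


def time_round_trips(pair, number=10000):
    """Return total seconds for `number` round trips of each approach."""
    candidates = {
        "text+re": text_round_trip,
        "pickle": pickle_round_trip,
        "marshal": marshal_round_trip,
    }
    return {
        name: timeit.timeit(lambda func=func: func(pair), number=number)
        for name, func in candidates.items()
    }


if __name__ == "__main__":
    # Sample values are arbitrary; substitute numbers typical of your data.
    for name, seconds in time_round_trips((2**200, 3**120)).items():
        print("%-8s %.6f s" % (name, seconds))
```

Speed is not the only consideration when the values are kept in a database: the `marshal` format is not guaranteed to stay the same across Python versions, so data written by one version may not load in another.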