ramu ramu - 1 month ago 30
Python Question

Python: save dictionaries through numpy.save

I have a large data set (millions of rows) in memory, in the form of numpy arrays and dictionaries.

Once this data is constructed, I want to store it in files,
so that I can later load it into memory quickly without reconstructing it from scratch.

The np.save and np.load functions do the job smoothly for numpy arrays.

But I am facing problems with dict objects.

See the sample below. d2 is the dictionary loaded from the file. As Out[28] shows, it has been loaded into d2 as a numpy array, not as a dict, so dict operations such as get do not work.

Is there a way to load the data from the file as a dict (instead of a numpy array)?

In [25]: d1={'key1':[5,10], 'key2':[50,100]}

In [26]: np.save("d1.npy", d1)

In [27]: d2=np.load("d1.npy")

In [28]: d2
Out[28]: array({'key2': [50, 100], 'key1': [5, 10]}, dtype=object)

In [30]: d1.get('key1') #original dict before saving into file
Out[30]: [5, 10]

In [31]: d2.get('key2') #dictionary loaded from the file
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-31-23e02e45bf22> in <module>()
----> 1 d2.get('key2')

AttributeError: 'numpy.ndarray' object has no attribute 'get'

Answer

np.save wraps the dict in a 0-dimensional object array (dtype=object), which is not a structured array. Use d2.item() to retrieve the actual dict object first:

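import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}
np.save("d1.npy", d1)
# Recent NumPy versions require allow_pickle=True to load object arrays
d2 = np.load("d1.npy", allow_pickle=True)
print(d1.get('key1'))
print(d2.item().get('key2'))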

result:

[5, 10]
[50, 100]
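
To see why .item() is needed, it may help to inspect what np.load actually returns. The sketch below is only an illustration: the temporary directory and the names path and loaded are my own choices, not part of the original question. It assumes a NumPy version in which allow_pickle must be passed explicitly when loading object arrays, and that the file comes from a trusted source, since unpickling can run arbitrary code.

```python
import os
import tempfile

import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}

# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d1.npy")
    # np.save converts the dict to a 0-d object array and pickles it.
    np.save(path, d1)
    # Loading object arrays relies on pickle; only do this for trusted files.
    loaded = np.load(path, allow_pickle=True)

print(loaded.shape)  # an empty tuple: the array has zero dimensions
print(loaded.dtype)  # object

# .item() unwraps the single element held by the 0-d array.
d2 = loaded.item()
print(type(d2) is dict)
print(d2.get('key2'))
```

Since the array has no dimensions, it holds exactly one element, and .item() hands that element back as the original Python object, after which get and the other dict methods work as usual.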