Python Question

How to make a dictionary the value component of a numpy array

I am a new Python 2.7 user. I recently learned about numpy arrays, and I am now just learning about dictionaries. Please excuse me if my syntax is not correct.

Let's say we have a dictionary:

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
'Bob': {'dogs': '5', 'cats': '6'},
'Chris': {'dogs': '7', 'cats': '8'},
'Dan': {'dogs': '9', 'cats': '10'}}

The keys are 'dogs' and 'cats', and the values are the numbers of each that Ann, Bob, Chris, and Dan have.

I want to invert the value component of my dictionary. I know I can convert the values to a list, then convert that to an array, and then convert back to a dictionary, but this seems tedious. Is there a way to make my value component a numpy array and leave the key component the way it is?


Based on your question and comments I think you just want the same dictionary structure, but with the numbers inverted:

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

for k in dict1.keys():
    value = dict1[k]
    for k1 in value.keys():
        value[k1] = 1/float(value[k1])

{'Ann': {'cats': 0.25, 'dogs': 0.3333333333333333},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.2},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.1, 'dogs': 0.1111111111111111}}

I modified the dictionary in place, just replacing the numeric strings with their inverse, e.g. '4' with 0.25.

Iterating on two levels of keys() is, in a sense, tedious, but it's the straightforward thing to do when working with nested dictionaries. I wrote the for loop in one try with no errors, though even with experience I usually have to try several things before getting something that works. I iterated on keys so I could easily change the values in place. If I wanted to make a copy, I probably could have written it as a nested dict comprehension, but it would be more obscure.

Provided it does the right thing, this loop is faster than anything involving numpy or pandas for a dictionary of this size, because creating the arrays takes time.


A numpy approach - much more advanced coding (display from an IPython session, with numpy already imported as np):

In [65]: dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
    ...:          'Bob': {'dogs': '5', 'cats': '6'},
    ...:          'Chris': {'dogs': '7', 'cats': '8'},
    ...:          'Dan': {'dogs': '9', 'cats': '10'}}

In [66]: dt = np.dtype([('name','U5'),('dogs',float),('cats',float)])
# define a structured array dtype.

In [67]: def foo(k,v):
    ...:     return (k, v['dogs'], v['cats'])
# define a helper function - just helps organize my thoughts better 

In [68]: alist=[foo(k,v) for k,v in dict1.items()]

In [69]: alist
Out[69]: [('Chris', '7', '8'), ('Bob', '5', '6'), ('Dan', '9', '10'), ('Ann', '3', '4')]
# this is a list of tuples - a critical format for the next step    

In [70]: arr = np.array(alist, dtype=dt)

In [71]: arr
array([('Chris', 7.0, 8.0), 
       ('Bob', 5.0, 6.0), 
       ('Dan', 9.0, 10.0),
       ('Ann', 3.0, 4.0)], 
      dtype=[('name', '<U5'), ('dogs', '<f8'), ('cats', '<f8')])

I've converted the dictionary to a structured array, with 3 fields. This is similar to what I'd get from reading a csv file like:

name, dogs, cats
Ann, 3, 4
Bob, 5, 6
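
To make the CSV comparison concrete, here is a minimal sketch of loading that kind of text into the same structured layout with np.genfromtxt. The csv_text string and the csv_arr name are made up for illustration, and an in-memory string stands in for a real file. It is written for Python 3.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for a CSV file on disk.
csv_text = u"name, dogs, cats\nAnn, 3, 4\nBob, 5, 6\n"

# Same field layout as the dt defined in the session above.
dt = np.dtype([('name', 'U5'), ('dogs', float), ('cats', float)])

csv_arr = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=',',
    skip_header=1,     # skip the "name, dogs, cats" header line
    dtype=dt,
    autostrip=True,    # drop the spaces after each comma
    encoding='utf-8',
)

print(csv_arr['name'])   # the name column
print(csv_arr['dogs'])   # the dogs column, as floats
```

The result can be indexed by field name in the same way as arr.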

The dogs and cats fields are numeric, so I can invert their values:

In [72]: arr['dogs']=1/arr['dogs']
In [73]: arr['cats']=1/arr['cats']

In [74]: arr
array([('Chris', 0.14285714285714285, 0.125),
       ('Bob', 0.2, 0.16666666666666666), 
       ('Dan', 0.1111111111111111, 0.1),
       ('Ann', 0.3333333333333333, 0.25)], 
      dtype=[('name', '<U5'), ('dogs', '<f8'), ('cats', '<f8')])

The result is the same numbers as in the dictionary case, but in a table layout.
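
Since the question asks about converting back to a dictionary, here is one possible sketch of the full round trip: dictionary to structured array, invert, and back to a nested dictionary. The names numeric_fields and inverted are my own, and the strings are converted with float() explicitly rather than relying on numpy to do it. Written for Python 3.

```python
import numpy as np

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

dt = np.dtype([('name', 'U5'), ('dogs', float), ('cats', float)])

# Build the structured array from a list of tuples, one tuple per person.
rows = [(k, float(v['dogs']), float(v['cats'])) for k, v in dict1.items()]
arr = np.array(rows, dtype=dt)

# Invert each numeric field for all rows at once.
arr['dogs'] = 1 / arr['dogs']
arr['cats'] = 1 / arr['cats']

# Rebuild a nested dictionary keyed by the name field.
numeric_fields = [n for n in arr.dtype.names if n != 'name']
inverted = {str(row['name']): {f: float(row[f]) for f in numeric_fields}
            for row in arr}

print(inverted['Ann'])
```

The original dict1 is left unchanged, with its values still strings.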


A dictionary comprehension version - same double dictionary iteration as the first solution, but building a new dictionary rather than making changes in place:

In [78]: {k1:{k2:1/float(v2) for k2,v2 in v1.items()} for k1,v1 in dict1.items()}
{'Ann': {'cats': 0.25, 'dogs': 0.3333333333333333},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.2},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.1, 'dogs': 0.1111111111111111}}
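
To check the difference from the in-place loop, the same comprehension can be wrapped in a small function and the original inspected afterwards. The function name invert_counts is hypothetical. This is a sketch, not the only way to write it.

```python
def invert_counts(nested):
    """Return a new nested dict with each numeric string replaced by its inverse."""
    return {name: {pet: 1 / float(count) for pet, count in pets.items()}
            for name, pets in nested.items()}


dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

inverted = invert_counts(dict1)

print(inverted['Ann']['cats'])   # inverse, as a float
print(dict1['Ann']['cats'])      # original value, still a string
```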


When the numeric values are in an array, it is possible to take the numeric inverse of all the values at once. That's the beauty of numpy. But getting there can require some advanced numpy coding.

For example, I could take the 2 numeric fields of arr, and view them as a 2d array:

In [80]: arr[['dogs','cats']].view('(2,)float')
array([[ 0.14285714,  0.125     ],
       [ 0.2       ,  0.16666667],
       [ 0.11111111,  0.1       ],
       [ 0.33333333,  0.25      ]])

In [81]: 1/arr[['dogs','cats']].view('(2,)float')
array([[  7.,   8.],
       [  5.,   6.],
       [  9.,  10.],
       [  3.,   4.]])

This gets back the original numbers (without the name labels).
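
A note for newer numpy versions: as far as I know, from numpy 1.16 on, indexing with a list of field names returns a view that keeps the original field offsets, so the .view('(2,)float') call above may raise an error there. A sketch of an alternative uses structured_to_unstructured from numpy.lib.recfunctions, which copies the selected fields into an ordinary 2d array. The small two-row arr here is just for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = np.dtype([('name', 'U5'), ('dogs', float), ('cats', float)])
arr = np.array([('Ann', 3.0, 4.0), ('Bob', 5.0, 6.0)], dtype=dt)

# Pull the two numeric fields out as a plain (n, 2) float array.
nums = rfn.structured_to_unstructured(arr[['dogs', 'cats']])

# Invert every value at once.
inv = 1 / nums

print(nums.shape)
print(inv)
```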