theNamesCross - 4 days ago
Python Question

Pythonic alternative to (nested) dictionaries with the same keys?

I find myself avoiding dictionaries because, often, nearly half of their code is duplicated. This typically happens with nested dictionaries, where all sub-dictionaries contain the same keys but different values.

I manually create a large parent dictionary, where each key contains a nested dictionary, to be used in external modules. The nested dictionaries all use the same keys to define configuration parameters. This usage is explicit and works but it feels foolish to retype or copy/paste the keys for each nested dictionary I create manually. I am not overly concerned about optimizing memory or performance, just wondering if I should be doing this another, more Pythonic way.

As a trivial example of a pattern I often see:

people_dict = {
    "Charles Lindberg": {"address": "123 St.",
                         "famous": True},
    "Me": {"address": "456 St.",
           "famous": False}
}

>>> people_dict["Charles Lindberg"]["address"]
'123 St.'


While the dictionary enables explicit code, it is tedious and error-prone to define nested dictionaries with duplicate keys. In this example, half of each nested dictionary is duplicate code common to all the nested dictionaries.
I have tried using tuples to eliminate the duplicate keys, but I find this leads to fragile code: values are looked up by position rather than by key, so any change in position breaks the lookup. It also leads to code that is not explicit and is hard to follow.

people_dict = {
    "Charles Lindberg": ("123 St.", True),
    "Me": ("456 St.", False),
}

>>> people_dict["Charles Lindberg"][0]
'123 St.'


Instead, I write a class to encapsulate the same information. This successfully reduces duplicate code...

class Person(object):
    def __init__(self, address, famous=False):
        self.address = address
        self.famous = famous

people_dict = {
    "Charles Lindberg": Person("123 St.", famous=True),
    "Me": Person("456 St."),
}

>>> people_dict["Charles Lindberg"].address
'123 St.'


Creating a class seems a little overkill... The standard data types seem too basic...

I imagine there's a better way to do this in Python, without having to write your own class?



What is the best way to avoid duplicate code when creating nested dictionaries with common keys?


Answer

It sounds like you have a matrix of data, since every "row" has the same keys (columns), so I'd use a NumPy structured array:

import numpy as np

dtype = [('name', object), ('address', object), ('famous', bool)]
people = np.array([
        ("Charles Lindberg", "123 St.", True),
        ("Me", "456 St.", False),
        ], dtype)

charlie = people[people['name'] == 'Charles Lindberg'][0]
print(charlie['address'])

Or use pandas, which is higher-level:

import pandas as pd
people = pd.DataFrame(people_dict)
print(people['Charles Lindberg']['address'])

That loads your original dict-of-dicts people_dict straight into a DataFrame, with each name as a column and each inner key as a row label, and gives you similar lookup features.
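
If NumPy or pandas is more than you need, one standard-library option worth considering is collections.namedtuple. It gives you lookup by field name without a hand-written class, and the field names are typed only once. The following is a minimal sketch, assuming the two fields from the question; the name Person is borrowed from the question's class, and converted is a hypothetical name used for illustration.

```python
from collections import namedtuple

# Illustrative record type; the field names mirror the keys used in the question.
Person = namedtuple("Person", ["address", "famous"])

# The field names are declared once above, not repeated per entry.
people = {
    "Charles Lindberg": Person("123 St.", True),
    "Me": Person(address="456 St.", famous=False),
}

# Access is by name, so reordering the fields does not break lookups.
print(people["Charles Lindberg"].address)

# An existing dict-of-dicts can be converted by unpacking each inner dict.
people_dict = {
    "Charles Lindberg": {"address": "123 St.", "famous": True},
    "Me": {"address": "456 St.", "famous": False},
}
converted = {name: Person(**fields) for name, fields in people_dict.items()}
```

Note that namedtuple instances are immutable, so this fits configuration data that is defined once and then only read. If you need to change values after creation, a small class is still the simpler choice.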
