ou_snaaksie - 2 months ago
Python Question

Strange behavior when appending dictionary to list

I am reading JSON into my script and building a list consisting of dictionaries.

My JSON:

{
  "JMF": {
    "table1": {
      "email": "JMF1@fake.com",
      "guests": [
        "test1",
        "test2"
      ]
    },
    "table2": {
      "email": "JMF2@fake.com",
      "guests": [
        "test3"
      ]
    }
  },
  "JMC": {
    "table3": {
      "email": "JMC1@fake.com",
      "guests": [
        "test11"
      ]
    }
  },
  "JMD": {
    "table4": {
      "email": "JMD1@fake.com",
      "guests": [
        "test12"
      ]
    },
    "table5": {
      "email": "JMD2@fake.com",
      "guests": [
        "test17"
      ]
    }
  }
}


My code:

def get_json():
    userinfo_list = []
    with open('guest_users.json') as json_file:
        json_file = json.load(json_file)
        keys = json_file.keys()
        for key in keys:
            userinfo = {}
            for table_key in json_file[key].keys():
                email = json_file[key][table_key]['email']
                users_dict = {}
                users_list = []
                for user in json_file[key][table_key]['guests']:
                    users_dict['username'] = user
                    users_dict['password'] = generate_password()
                    users_list.append(users_dict)
                userinfo['company'] = key
                userinfo['email'] = email
                userinfo['userinfo'] = users_list
                userinfo_list.append(userinfo)
                print(userinfo)
                print(userinfo_list)


The problem is that the values in userinfo_list get overwritten as soon as my JSON has two sub-keys (table*).

This is the output I get, which doesn't make sense:

{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}
[{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}]
{'userinfo': [{'username': 'test17', 'password': 'A8Jue5'}], 'email': 'JMD2@fake.com', 'company': 'JMD'}
[{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}, {'userinfo': [{'username': 'test17', 'password': 'A8Jue5'}], 'email': 'JMD2@fake.com', 'company': 'JMD'}]
{'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}
[{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}]
{'userinfo': [{'username': 'test2', 'password': 'GagQ59'}, {'username': 'test2', 'password': 'GagQ59'}], 'email': 'JMF1@fake.com', 'company': 'JMF'}
[{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}, {'userinfo': [{'username': 'test2', 'password': 'GagQ59'}, {'username': 'test2', 'password': 'GagQ59'}], 'email': 'JMF1@fake.com', 'company': 'JMF'}]
{'userinfo': [{'username': 'test3', 'password': 'U9gP0j'}], 'email': 'JMF2@fake.com', 'company': 'JMF'}
[{'userinfo': [{'username': 'test11', 'password': '1fEAg0'}], 'email': 'JMC1@fake.com', 'company': 'JMC'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}, {'userinfo': [{'username': 'test12', 'password': '0JSpc0'}], 'email': 'JMD1@fake.com', 'company': 'JMD'}, {'userinfo': [{'username': 'test3', 'password': 'U9gP0j'}], 'email': 'JMF2@fake.com', 'company': 'JMF'}, {'userinfo': [{'username': 'test3', 'password': 'U9gP0j'}], 'email': 'JMF2@fake.com', 'company': 'JMF'}]

Answer

You are re-appending the same single dictionary each iteration:

users_dict = {}  # only one copy of this dictionary is ever created
users_list = []
for user in json_file[key][table_key]['guests']:
    users_dict['username'] = user
    users_dict['password'] = generate_password()
    users_list.append(users_dict)  # appending a reference to users_dict

Appending does not create a copy, so you get multiple references to the same dictionary, and you'll only see the last change reflected. You make the same mistake with the userinfo dictionary.
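
To see the aliasing in isolation, here is a minimal sketch that uses a fixed list of names instead of your JSON. The variable names (shared, aliased, independent) are placeholders of mine, not taken from your code:

```python
# One dictionary, appended twice: both list entries are the same object.
shared = {}
aliased = []
for name in ['test1', 'test2']:
    shared['username'] = name
    aliased.append(shared)  # stores a reference, not a copy

print(aliased)                   # [{'username': 'test2'}, {'username': 'test2'}]
print(aliased[0] is aliased[1])  # True

# A fresh dictionary per iteration keeps the entries independent.
independent = []
for name in ['test1', 'test2']:
    independent.append({'username': name})

print(independent)  # [{'username': 'test1'}, {'username': 'test2'}]
```

The first loop reproduces the duplicated 'test2' entries from your output; the second is the shape of the fix.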

Create a new dictionary in the loop:

users_list = []
for user in json_file[key][table_key]['guests']:
    users_dict = {}
    users_dict['username'] = user
    users_dict['password'] = generate_password()
    users_list.append(users_dict)

You can just specify the key-value pairs directly when creating the dictionary:

users_list = []
for user in json_file[key][table_key]['guests']:
    users_dict = {
        'username': user,
        'password': generate_password()
    }
    users_list.append(users_dict)

and this can be simplified with a list comprehension to:

users_list = [{'username': user, 'password': generate_password()}
              for user in json_file[key][table_key]['guests']]

Note that you don't need to call dict.keys() to loop over a dictionary. You can loop directly over the dictionary with the exact same results. You probably want to loop over .items() instead and avoid having to look up the value for the key each time, and use .values() when you don't actually need the key at all:

userinfo_list = []
for company, db in json_file.items():
    for table in db.values():
        userinfo = {
            'company': company,
            'email': table['email'],
            'userinfo': [
                {'username': user, 'password': generate_password()}
                for user in table['guests']]
        }
        userinfo_list.append(userinfo)

The creation of dictionaries per table per company can also be replaced by a list comprehension, but at this point sticking to nested for loops is probably going to be easier to comprehend for future readers.
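
For comparison, this is roughly what the single-comprehension version would look like. It is only a sketch: the generate_password() stub and the trimmed json_file data are stand-ins of mine so the snippet runs on its own, and it builds the same list as the nested for loops:

```python
# Hypothetical stand-in; your real generate_password() goes here.
def generate_password():
    return 'placeholder'

# Trimmed copy of the sample data, as json.load() would return it.
json_file = {
    'JMF': {
        'table1': {'email': 'JMF1@fake.com', 'guests': ['test1', 'test2']},
        'table2': {'email': 'JMF2@fake.com', 'guests': ['test3']},
    },
}

# The two for clauses run in the same order as the nested loops would.
userinfo_list = [
    {
        'company': company,
        'email': table['email'],
        'userinfo': [
            {'username': user, 'password': generate_password()}
            for user in table['guests']
        ],
    }
    for company, db in json_file.items()
    for table in db.values()
]
```

Every dictionary here is created fresh by the comprehension, so nothing is shared between entries.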

The above now produces:

[{'company': 'JMF',
  'email': 'JMF1@fake.com',
  'userinfo': [{'password': 'random_password_really', 'username': 'test1'},
               {'password': 'random_password_really', 'username': 'test2'}]},
 {'company': 'JMF',
  'email': 'JMF2@fake.com',
  'userinfo': [{'password': 'random_password_really', 'username': 'test3'}]},
 {'company': 'JMC',
  'email': 'JMC1@fake.com',
  'userinfo': [{'password': 'random_password_really', 'username': 'test11'}]},
 {'company': 'JMD',
  'email': 'JMD1@fake.com',
  'userinfo': [{'password': 'random_password_really', 'username': 'test12'}]},
 {'company': 'JMD',
  'email': 'JMD2@fake.com',
  'userinfo': [{'password': 'random_password_really', 'username': 'test17'}]}]

from your sample data (and my own definition of generate_password()).
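
If you want to run the snippets above without your own helper, something like the following would do as a stand-in for generate_password(). This is an assumption on my part, not your implementation: I guessed at six alphanumeric characters from the passwords in your output, and the ALPHABET name and length parameter are mine:

```python
import secrets
import string

# Assumed character set: ASCII letters and digits.
ALPHABET = string.ascii_letters + string.digits

def generate_password(length=6):
    """Return a random alphanumeric password of the given length."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))
```

The secrets module is meant for security-sensitive randomness, which makes it a better fit for passwords than the random module.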