Tony Tony - 29 days ago 32
Python Question

Python - Mean of each value across keys in dict

I am having trouble iterating across an entire dictionary to do simple summary statistics (an average) for each element of a value across keys.

My dictionary consists of keys and values that are lists of numbers:

test_dict={'NJ':[20,50,70,90,100],'NY':[10,3,0,99,57],'CT':[90,1000,2,3.4,5]}


I know that I can access the first value of each key, for instance, by doing the below, but I am having trouble with the obvious next step of adding another for loop to iterate across all elements in the values.

location1=[element[0] for element in test_dict.values()]
location1_avg=sum(location1)/len(location1)


My ultimate goal is to have a dictionary with labels as keys (Location 1...i) and the average value across states for that location. So the first key-value would be Location1: 40, and so on.

I have the attempt below, but the error message is 'list index out of range' and I do not know how to iterate properly in this case.

for element in test_dict.values():
    avg=list()
    for nums in element[i]:
        avg[i]=sum(element[i][nums])/len(element[i][nums])


Adding desired output per requests

soln_dict={'Location1':40,'Location2':351,'Location3':24,'Location4':43.24,'Location5':54}


Thank you for your help!

Answer Source

Just do :

from functools import reduce   #reduce is not a builtin in Python 3

#loop through the dictionary
for key,value in test_dict.items():

   #use reduce to sum the list, then divide by its length to get the avg
   print(key, reduce(lambda x, y: x + y, value) / len(value))

This will print :

NJ 66.0
NY 33.8
CT 220.08
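
For the per-state averages, a minimal alternative sketch using the standard library's statistics.mean instead of reduce; the name state_avgs is only illustrative and not part of the question:

```python
from statistics import mean

test_dict = {'NJ': [20, 50, 70, 90, 100],
             'NY': [10, 3, 0, 99, 57],
             'CT': [90, 1000, 2, 3.4, 5]}

# One average per key: mean() sums the list and divides by its length.
# state_avgs is a hypothetical name chosen for this sketch.
state_avgs = {state: mean(values) for state, values in test_dict.items()}

for state, avg in state_avgs.items():
    print(state, avg)
```

This stores the averages in a dictionary rather than only printing them, which may be handier if they are needed later.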

Edit : As per change in OP requirements :

l = list(test_dict.values())                            #convert values to a list of lists
print(l)
#[[20, 50, 70, 90, 100], [10, 3, 0, 99, 57], [90, 1000, 2, 3.4, 5]]
d={}                                                    #final dictionary
for i in range(len(l[0])):
   row_list = [row[i] for row in l]                     #get values column-wise
   d['location'+str(i+1)] = sum(row_list)/len(row_list) #calculate avg

print(d)
#{'location1': 40.0, 'location2': 351.0, 'location3': 24.0, 'location4': 64.13333333333334, 'location5': 54.0}

Note : the average you gave in the question for location4 is wrong. It should be (90 + 99 + 3.4) / 3, which is about 64.13, not 43.24.
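
The column-wise loop above can also be sketched with zip, which groups the i-th element of every list together. This is one possible variant, not the only way to do it; the helper name column_means is hypothetical, and the sketch assumes all lists have the same length (zip silently stops at the shortest one):

```python
test_dict = {'NJ': [20, 50, 70, 90, 100],
             'NY': [10, 3, 0, 99, 57],
             'CT': [90, 1000, 2, 3.4, 5]}

def column_means(data):
    """Average the i-th element of every list in data (hypothetical helper)."""
    # zip(*lists) yields one tuple per position: (20, 10, 90), (50, 3, 1000), ...
    columns = zip(*data.values())
    return {'Location' + str(i): sum(col) / len(col)
            for i, col in enumerate(columns, start=1)}

soln_dict = column_means(test_dict)
print(soln_dict)
```

Because enumerate starts at 1, the keys come out as Location1 through Location5 without the manual i+1.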