Blabber Blabber - 4 months ago 10
Python Question

Complicated data structure union

I have two lists of the following form:

lst1 = [{u'Hours': [{u'HourOfDay': 15, u'TotalVisits': 1, u'TotalTimeSpent': 7223.0}, {u'HourOfDay': 11, u'TotalVisits': 1, u'TotalTimeSpent': 4503.0}, {u'HourOfDay': 18, u'TotalVisits': 1, u'TotalTimeSpent': 6273.0}, {u'HourOfDay': 12, u'TotalVisits': 3, u'TotalTimeSpent': 37550.0}], u'DayOfWeek': 1}, {u'Hours': [{u'HourOfDay': 9, u'TotalVisits': 50, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 19, u'TotalVisits': 1, u'TotalTimeSpent': 7925.0}, {u'HourOfDay': 14, u'TotalVisits': 1, u'TotalTimeSpent': 4873.0}, {u'HourOfDay': 0, u'TotalVisits': 5, u'TotalTimeSpent': 10210.0}, {u'HourOfDay': 11, u'TotalVisits': 4, u'TotalTimeSpent': 21825.0}, {u'HourOfDay': 8, u'TotalVisits': 19, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 9, u'TotalVisits': 8, u'TotalTimeSpent': 13055.0}, {u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 2816.0}, {u'HourOfDay': 11, u'TotalVisits': 2, u'TotalTimeSpent': 15723.0}, {u'HourOfDay': 13, u'TotalVisits': 9, u'TotalTimeSpent': 30987.0}, {u'HourOfDay': 10, u'TotalVisits': 11, u'TotalTimeSpent': 68384.0}, {u'HourOfDay': 15, u'TotalVisits': 26, u'TotalTimeSpent': 62650.0}, {u'HourOfDay': 12, u'TotalVisits': 3, u'TotalTimeSpent': 8626.0}, {u'HourOfDay': 17, u'TotalVisits': 2, u'TotalTimeSpent': 34456.0}, {u'HourOfDay': 16, u'TotalVisits': 5, u'TotalTimeSpent': 6915.0}, {u'HourOfDay': 15, u'TotalVisits': 5, u'TotalTimeSpent': 3151.0}, {u'HourOfDay': 3, u'TotalVisits': 8, u'TotalTimeSpent': 54720.0}, {u'HourOfDay': 19, u'TotalVisits': 6, u'TotalTimeSpent': 23497.0}], u'DayOfWeek': 2}, {u'Hours': [{u'HourOfDay': 11, u'TotalVisits': 56, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 4418.0}, {u'HourOfDay': 11, u'TotalVisits': 4, u'TotalTimeSpent': 9952.0}, {u'HourOfDay': 9, u'TotalVisits': 5, u'TotalTimeSpent': 12678.0}, {u'HourOfDay': 10, u'TotalVisits': 11, u'TotalTimeSpent': 89911.0}, {u'HourOfDay': 22, u'TotalVisits': 2, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 14, u'TotalVisits': 1, u'TotalTimeSpent': 1593.0}, 
{u'HourOfDay': 12, u'TotalVisits': 10, u'TotalTimeSpent': 36453.0}], u'DayOfWeek': 3}, {u'Hours': [{u'HourOfDay': 13, u'TotalVisits': 22, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 13, u'TotalVisits': 3, u'TotalTimeSpent': 4800.0}, {u'HourOfDay': 12, u'TotalVisits': 1, u'TotalTimeSpent': 212.0}, {u'HourOfDay': 10, u'TotalVisits': 4, u'TotalTimeSpent': 13503.0}, {u'HourOfDay': 14, u'TotalVisits': 7, u'TotalTimeSpent': 19533.0}, {u'HourOfDay': 11, u'TotalVisits': 5, u'TotalTimeSpent': 17512.0}, {u'HourOfDay': 22, u'TotalVisits': 14, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 6121.0}, {u'HourOfDay': 9, u'TotalVisits': 1, u'TotalTimeSpent': 5455.0}, {u'HourOfDay': 15, u'TotalVisits': 7, u'TotalTimeSpent': 21476.0}], u'DayOfWeek': 4}, {u'Hours': [{u'HourOfDay': 15, u'TotalVisits': 4, u'TotalTimeSpent': 12697.0}, {u'HourOfDay': 11, u'TotalVisits': 4, u'TotalTimeSpent': 10656.0}, {u'HourOfDay': 9, u'TotalVisits': 5, u'TotalTimeSpent': 11879.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 924.0}, {u'HourOfDay': 19, u'TotalVisits': 1, u'TotalTimeSpent': 8075.0}, {u'HourOfDay': 10, u'TotalVisits': 20, u'TotalTimeSpent': 15478.0}, {u'HourOfDay': 12, u'TotalVisits': 6, u'TotalTimeSpent': 24608.0}, {u'HourOfDay': 14, u'TotalVisits': 8, u'TotalTimeSpent': 12858.0}, {u'HourOfDay': 13, u'TotalVisits': 1, u'TotalTimeSpent': 2545.0}], u'DayOfWeek': 5}, {u'Hours': [{u'HourOfDay': 19, u'TotalVisits': 1, u'TotalTimeSpent': 426.0}, {u'HourOfDay': 14, u'TotalVisits': 1, u'TotalTimeSpent': 1528.0}, {u'HourOfDay': 12, u'TotalVisits': 4, u'TotalTimeSpent': 11558.0}, {u'HourOfDay': 4, u'TotalVisits': 4, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 771.0}, {u'HourOfDay': 13, u'TotalVisits': 4, u'TotalTimeSpent': 2449.0}, {u'HourOfDay': 15, u'TotalVisits': 17, u'TotalTimeSpent': 67983.0}, {u'HourOfDay': 11, u'TotalVisits': 5, u'TotalTimeSpent': 1452.0}, {u'HourOfDay': 10, u'TotalVisits': 3, u'TotalTimeSpent': 
5075.0}, {u'HourOfDay': 8, u'TotalVisits': 3, u'TotalTimeSpent': 4769.0}, {u'HourOfDay': 9, u'TotalVisits': 14, u'TotalTimeSpent': 49453.0}, {u'HourOfDay': 10, u'TotalVisits': 1, u'TotalTimeSpent': 0.0}], u'DayOfWeek': 6}, {u'Hours': [{u'HourOfDay': 17, u'TotalVisits': 8, u'TotalTimeSpent': 0.0}], u'DayOfWeek': 7}]
lst2 = [{u'Hours': [{u'HourOfDay': 15, u'TotalVisits': 2, u'TotalTimeSpent': 425.0}, {u'HourOfDay': 13, u'TotalVisits': 1, u'TotalTimeSpent': 730.0}, {u'HourOfDay': 16, u'TotalVisits': 3, u'TotalTimeSpent': 70.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 240.0}, {u'HourOfDay': 10, u'TotalVisits': 2, u'TotalTimeSpent': 295.0}, {u'HourOfDay': 2, u'TotalVisits': 2, u'TotalTimeSpent': 1572.0}, {u'HourOfDay': 11, u'TotalVisits': 10, u'TotalTimeSpent': 1856.0}, {u'HourOfDay': 18, u'TotalVisits': 2, u'TotalTimeSpent': 232.0}, {u'HourOfDay': 12, u'TotalVisits': 3, u'TotalTimeSpent': 115.0}, {u'HourOfDay': 23, u'TotalVisits': 7, u'TotalTimeSpent': 1409.0}, {u'HourOfDay': 22, u'TotalVisits': 6, u'TotalTimeSpent': 1364.0}, {u'HourOfDay': 14, u'TotalVisits': 1, u'TotalTimeSpent': 5.0}, {u'HourOfDay': 19, u'TotalVisits': 1, u'TotalTimeSpent': 12.0}, {u'HourOfDay': 3, u'TotalVisits': 2, u'TotalTimeSpent': 127.0}, {u'HourOfDay': 12, u'TotalVisits': 2, u'TotalTimeSpent': 107.0}], u'DayOfWeek': 1}, {u'Hours': [{u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 9.0}, {u'HourOfDay': 12, u'TotalVisits': 1, u'TotalTimeSpent': 77.0}, {u'HourOfDay': 10, u'TotalVisits': 2, u'TotalTimeSpent': 16.0}, {u'HourOfDay': 6, u'TotalVisits': 1, u'TotalTimeSpent': 37.0}, {u'HourOfDay': 14, u'TotalVisits': 5, u'TotalTimeSpent': 956.0}, {u'HourOfDay': 9, u'TotalVisits': 1, u'TotalTimeSpent': 787.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 27.0}, {u'HourOfDay': 18, u'TotalVisits': 2, u'TotalTimeSpent': 24.0}, {u'HourOfDay': 19, u'TotalVisits': 7, u'TotalTimeSpent': 1123.0}, {u'HourOfDay': 18, u'TotalVisits': 2, u'TotalTimeSpent': 108.0}, {u'HourOfDay': 9, u'TotalVisits': 2, u'TotalTimeSpent': 39.0}, {u'HourOfDay': 17, u'TotalVisits': 1, u'TotalTimeSpent': 28.0}, {u'HourOfDay': 22, u'TotalVisits': 2, u'TotalTimeSpent': 117.0}, {u'HourOfDay': 13, u'TotalVisits': 2, u'TotalTimeSpent': 65.0}, {u'HourOfDay': 21, u'TotalVisits': 4, u'TotalTimeSpent': 870.0}, 
{u'HourOfDay': 20, u'TotalVisits': 2, u'TotalTimeSpent': 42.0}, {u'HourOfDay': 11, u'TotalVisits': 1, u'TotalTimeSpent': 10.0}, {u'HourOfDay': 23, u'TotalVisits': 2, u'TotalTimeSpent': 98.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 3.0}, {u'HourOfDay': 7, u'TotalVisits': 1, u'TotalTimeSpent': 14.0}], u'DayOfWeek': 2}, {u'Hours': [{u'HourOfDay': 7, u'TotalVisits': 1, u'TotalTimeSpent': 21.0}, {u'HourOfDay': 2, u'TotalVisits': 2, u'TotalTimeSpent': 11.0}, {u'HourOfDay': 22, u'TotalVisits': 1, u'TotalTimeSpent': 13.0}, {u'HourOfDay': 10, u'TotalVisits': 1, u'TotalTimeSpent': 4.0}, {u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 4.0}, {u'HourOfDay': 13, u'TotalVisits': 1, u'TotalTimeSpent': 112.0}, {u'HourOfDay': 19, u'TotalVisits': 2, u'TotalTimeSpent': 148.0}, {u'HourOfDay': 11, u'TotalVisits': 2, u'TotalTimeSpent': 10.0}, {u'HourOfDay': 20, u'TotalVisits': 1, u'TotalTimeSpent': 15.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 85.0}, {u'HourOfDay': 0, u'TotalVisits': 10, u'TotalTimeSpent': 1634.0}], u'DayOfWeek': 3}, {u'Hours': [{u'HourOfDay': 22, u'TotalVisits': 4, u'TotalTimeSpent': 1747.0}, {u'HourOfDay': 16, u'TotalVisits': 1, u'TotalTimeSpent': 3.0}, {u'HourOfDay': 23, u'TotalVisits': 2, u'TotalTimeSpent': 123.0}, {u'HourOfDay': 14, u'TotalVisits': 1, u'TotalTimeSpent': 28.0}, {u'HourOfDay': 0, u'TotalVisits': 1, u'TotalTimeSpent': 261.0}, {u'HourOfDay': 20, u'TotalVisits': 6, u'TotalTimeSpent': 22.0}, {u'HourOfDay': 11, u'TotalVisits': 1, u'TotalTimeSpent': 5.0}, {u'HourOfDay': 19, u'TotalVisits': 2, u'TotalTimeSpent': 131.0}, {u'HourOfDay': 8, u'TotalVisits': 10, u'TotalTimeSpent': 719.0}, {u'HourOfDay': 17, u'TotalVisits': 1, u'TotalTimeSpent': 2.0}], u'DayOfWeek': 4}, {u'Hours': [{u'HourOfDay': 21, u'TotalVisits': 5, u'TotalTimeSpent': 550.0}, {u'HourOfDay': 9, u'TotalVisits': 1, u'TotalTimeSpent': 0.0}, {u'HourOfDay': 11, u'TotalVisits': 1, u'TotalTimeSpent': 12.0}, {u'HourOfDay': 0, u'TotalVisits': 2, 
u'TotalTimeSpent': 58.0}, {u'HourOfDay': 17, u'TotalVisits': 1, u'TotalTimeSpent': 95.0}, {u'HourOfDay': 7, u'TotalVisits': 1, u'TotalTimeSpent': 841.0}, {u'HourOfDay': 22, u'TotalVisits': 9, u'TotalTimeSpent': 276.0}, {u'HourOfDay': 14, u'TotalVisits': 2, u'TotalTimeSpent': 129.0}, {u'HourOfDay': 8, u'TotalVisits': 2, u'TotalTimeSpent': 80.0}, {u'HourOfDay': 15, u'TotalVisits': 4, u'TotalTimeSpent': 98.0}, {u'HourOfDay': 11, u'TotalVisits': 3, u'TotalTimeSpent': 35.0}, {u'HourOfDay': 19, u'TotalVisits': 3, u'TotalTimeSpent': 119.0}, {u'HourOfDay': 23, u'TotalVisits': 2, u'TotalTimeSpent': 32.0}, {u'HourOfDay': 20, u'TotalVisits': 3, u'TotalTimeSpent': 322.0}, {u'HourOfDay': 21, u'TotalVisits': 1, u'TotalTimeSpent': 0.0}], u'DayOfWeek': 5}, {u'Hours': [{u'HourOfDay': 17, u'TotalVisits': 1, u'TotalTimeSpent': 8.0}, {u'HourOfDay': 18, u'TotalVisits': 4, u'TotalTimeSpent': 496.0}, {u'HourOfDay': 12, u'TotalVisits': 1, u'TotalTimeSpent': 8.0}, {u'HourOfDay': 0, u'TotalVisits': 1, u'TotalTimeSpent': 10.0}, {u'HourOfDay': 21, u'TotalVisits': 4, u'TotalTimeSpent': 222.0}, {u'HourOfDay': 20, u'TotalVisits': 6, u'TotalTimeSpent': 196.0}, {u'HourOfDay': 16, u'TotalVisits': 3, u'TotalTimeSpent': 98.0}, {u'HourOfDay': 2, u'TotalVisits': 4, u'TotalTimeSpent': 201.0}, {u'HourOfDay': 22, u'TotalVisits': 4, u'TotalTimeSpent': 653.0}, {u'HourOfDay': 10, u'TotalVisits': 2, u'TotalTimeSpent': 16.0}, {u'HourOfDay': 15, u'TotalVisits': 2, u'TotalTimeSpent': 92.0}, {u'HourOfDay': 23, u'TotalVisits': 1, u'TotalTimeSpent': 29.0}, {u'HourOfDay': 23, u'TotalVisits': 3, u'TotalTimeSpent': 182.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 430.0}, {u'HourOfDay': 8, u'TotalVisits': 1, u'TotalTimeSpent': 548.0}, {u'HourOfDay': 17, u'TotalVisits': 2, u'TotalTimeSpent': 40.0}], u'DayOfWeek': 6}, {u'Hours': [{u'HourOfDay': 22, u'TotalVisits': 2, u'TotalTimeSpent': 70.0}, {u'HourOfDay': 19, u'TotalVisits': 2, u'TotalTimeSpent': 120.0}, {u'HourOfDay': 14, u'TotalVisits': 4, 
u'TotalTimeSpent': 413.0}, {u'HourOfDay': 11, u'TotalVisits': 4, u'TotalTimeSpent': 806.0}, {u'HourOfDay': 10, u'TotalVisits': 1, u'TotalTimeSpent': 26.0}, {u'HourOfDay': 0, u'TotalVisits': 1, u'TotalTimeSpent': 840.0}, {u'HourOfDay': 1, u'TotalVisits': 1, u'TotalTimeSpent': 46.0}, {u'HourOfDay': 23, u'TotalVisits': 1, u'TotalTimeSpent': 252.0}, {u'HourOfDay': 21, u'TotalVisits': 3, u'TotalTimeSpent': 99.0}, {u'HourOfDay': 7, u'TotalVisits': 1, u'TotalTimeSpent': 771.0}, {u'HourOfDay': 8, u'TotalVisits': 3, u'TotalTimeSpent': 44.0}, {u'HourOfDay': 9, u'TotalVisits': 4, u'TotalTimeSpent': 123.0}, {u'HourOfDay': 15, u'TotalVisits': 4, u'TotalTimeSpent': 661.0}, {u'HourOfDay': 12, u'TotalVisits': 3, u'TotalTimeSpent': 309.0}, {u'HourOfDay': 18, u'TotalVisits': 2, u'TotalTimeSpent': 77.0}, {u'HourOfDay': 13, u'TotalVisits': 3, u'TotalTimeSpent': 123.0}, {u'HourOfDay': 3, u'TotalVisits': 3, u'TotalTimeSpent': 5324.0}, {u'HourOfDay': 20, u'TotalVisits': 2, u'TotalTimeSpent': 45.0}], u'DayOfWeek': 7}]


I want to add up the total time spent across these two lists and build a new list from the result, i.e. if an HourOfDay appears in both lists for the same DayOfWeek, add the TotalTimeSpent values; otherwise append the entry to that day's Hours list.

I have tried this but it doesn't run!

for item1, item2 in zip(lst1,lst2):
for hr1 in item1["Hours"]:
for hr2 in item2["Hours"]:
if hr1["HourOfDay"] == hr2["HourOfDay"]:
hr1["TotalTimeSpent"] = hr1["TotalTimeSpent"] + hr2["TotalTimeSpent"]
else:
item1["Hours"].append(hr2)


What I want is the union of the two lists for every day; if the HourOfDay field is the same in both sets for the same DayOfWeek, then add up TotalVisits and TotalTimeSpent.

Answer

I don't have enough rep to comment, so I'll write my concerns here.

First off, I'm guessing that the missing indent level on the second for loop is a typo and that it doesn't appear in your original code.

Secondly, I don't understand the meaning of your else statement.

What you are essentially doing is creating an infinite loop.

What that means is that, for every hour in the first list, your code scans the second list looking for a match.

If a match is found, you update hr1 (it is only the loop variable, a "cursor", but it refers to the same dict that is stored in the list, so that update does stick); otherwise you APPEND an element to the end of the ORIGINAL list, and you do that for every hr2 that doesn't match, not once per missing hour.
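
On the "cursor" point, it may help to check what the loop variable actually is. A minimal sketch (the `hours` list is a cut-down, made-up stand-in for one day's Hours): writing through the loop variable changes the dict inside the list, while rebinding the name does not.

```python
# Cut-down stand-in for one day's "Hours" list (illustrative only).
hours = [{"HourOfDay": 15, "TotalTimeSpent": 7223.0}]

# Writing through the loop variable: hr is the very dict stored in the list,
# so the list's content changes.
for hr in hours:
    hr["TotalTimeSpent"] += 425.0

in_place_result = hours[0]["TotalTimeSpent"]

# Rebinding the loop variable: only the name hr points somewhere else,
# the list still holds the old dict.
for hr in hours:
    hr = {"HourOfDay": 15, "TotalTimeSpent": 0.0}

after_rebinding = hours[0]["TotalTimeSpent"]
```

So the in-place sum is not the problem by itself; the append is.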

Keep in mind that you're scanning the original list, not a copy, so every append hands the outer loop one more element to visit and it never reaches the end.
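
A minimal sketch of why appending while scanning is a problem. The function names are made up for illustration, and the `limit` cap is there only so the demo stops; without it the first loop would keep going.

```python
def visited_when_appending(items, limit=10):
    """Iterate over items while appending to it; give up after limit steps."""
    visited = []
    for value in items:
        visited.append(value)
        items.append(value + 100)  # grows the very list being iterated
        if len(visited) >= limit:  # safety cap, otherwise this never ends
            break
    return visited


def visited_over_copy(items):
    """Same loop body, but iterating over a snapshot of the list."""
    visited = []
    for value in list(items):      # list(items) is a copy taken up front
        visited.append(value)
        items.append(value + 100)  # the original grows, the snapshot does not
    return visited


growing = [1, 2, 3]
seen_growing = visited_when_appending(growing)

snapshot_source = [1, 2, 3]
seen_snapshot = visited_over_copy(snapshot_source)
```

In the first function the loop walks into the elements it has just appended; in the second it visits only the three original ones.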

I wrote this piece of code real quick, give it a try and let me know:

import copy

# Deep copy, so that lst1 itself is left untouched
final = copy.deepcopy(lst1)
# zip pairs the days by position, so both lists must list the days in the same order
for item1, item2 in zip(final, lst2):
    # Backup copy: entries are popped from it, not from lst2
    it2_hrs = list(item2["Hours"])
    # Scan first list
    for hr1 in item1["Hours"]:
        # Scan what is left of the second list
        for index, hr2 in enumerate(it2_hrs):
            if hr1["HourOfDay"] == hr2["HourOfDay"]:
                # The question asks for both totals to be summed
                hr1["TotalVisits"] += hr2["TotalVisits"]
                hr1["TotalTimeSpent"] += hr2["TotalTimeSpent"]
                # A matched entry must not be appended again below
                it2_hrs.pop(index)
                break
    # Hours that only appear in the second list are appended as they are
    item1["Hours"].extend(it2_hrs)

print(final)
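
If the nested scans get hard to follow, another way to look at it is to index everything by day and hour first. This is only a sketch under assumptions: `merge_days` is a name I made up, the sample data is cut down from the first day of your lists, and unlike a first-match scan it also sums entries that repeat the same HourOfDay within a single list (your data has some of those), which may or may not be what you want.

```python
from collections import OrderedDict


def merge_days(first, second):
    """Merge two lists of {'DayOfWeek': ..., 'Hours': [...]} records.

    Entries sharing DayOfWeek and HourOfDay are summed; the inputs are
    not modified. (Hypothetical helper, not part of any library.)
    """
    totals = OrderedDict()  # DayOfWeek -> OrderedDict(HourOfDay -> summed entry)
    for day in list(first) + list(second):
        hours = totals.setdefault(day["DayOfWeek"], OrderedDict())
        for entry in day["Hours"]:
            hour = entry["HourOfDay"]
            if hour not in hours:
                hours[hour] = {"HourOfDay": hour,
                               "TotalVisits": 0,
                               "TotalTimeSpent": 0.0}
            hours[hour]["TotalVisits"] += entry["TotalVisits"]
            hours[hour]["TotalTimeSpent"] += entry["TotalTimeSpent"]
    return [{"DayOfWeek": dow, "Hours": list(hours.values())}
            for dow, hours in totals.items()]


# Cut-down sample taken from the first day of each list in the question.
sample1 = [{"DayOfWeek": 1, "Hours": [
    {"HourOfDay": 15, "TotalVisits": 1, "TotalTimeSpent": 7223.0},
    {"HourOfDay": 11, "TotalVisits": 1, "TotalTimeSpent": 4503.0}]}]
sample2 = [{"DayOfWeek": 1, "Hours": [
    {"HourOfDay": 15, "TotalVisits": 2, "TotalTimeSpent": 425.0},
    {"HourOfDay": 13, "TotalVisits": 1, "TotalTimeSpent": 730.0}]}]

merged = merge_days(sample1, sample2)
```

With the full data the call would be `merge_days(lst1, lst2)`. Because it keys on DayOfWeek rather than on position, it does not depend on both lists having their days in the same order.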