Perlinn - 1 month ago
Python Question

Removing duplicate dictionary entries while adding and subtracting their values

current output :
https://i.imgur.com/Jd5YhOw.jpg

wanted output: https://i.imgur.com/STxYE6X.jpg

There are so many loops within loops that I find this confusing. This line is messed up, and I don't know where to put it:

writer.writerow([items,[quantities[x] for x in items],[prices[x] for x in items] ])

Does it have to do with d.clear()? That is, do I need to clear the dictionary so that the last entry is removed, since otherwise it keeps stacking on top of the previous ones?

d.clear()

I have updated my code:

import csv
from itertools import groupby
from operator import itemgetter
import re
from collections import defaultdict

d = {}

#open directory and saving directory
with open("sample_data.csv", "rb") as f, open("out.csv", "wb") as out:
    reader = csv.reader(f)
    next(reader)
    writer = csv.writer(out)
    #the first column header
    writer.writerow(["item","quantity","amount"])
    groups = groupby(csv.reader(f), key=itemgetter(0))

    for k, v in groups:
        v = list(v)

        #selecting and slicing transaction; It will be the master data used for TOTAL,items,amount and cost
        transaction = [ x[1] for x in v[8:] ]
        textwordfortransaction = str(transaction)

        #using re.findall instead of re.search to return all via regex for items
        itemprinter = re.findall(r"(?<=\s\s)\w+(?:\s\w+)*(?=\s\s)",textwordfortransaction)

        #using re.findall instead of re.search to return all via regex for amount aka quantity
        amountprinter = re.findall(r"'(-?\d+)\s+(?:[A-Za-z ]*)",textwordfortransaction)

        #using re.findall instead of re.search to return all via regex for cost
        costprinter = re.findall(r"(?:'-?\d+[A-Za-z ]*)(-?\d+[.]?\d*)",textwordfortransaction)

        d[tuple(itemprinter)] = tuple(amountprinter),tuple(costprinter)

    prices, quantities = defaultdict(int), defaultdict(int)
    for key, val in d.items():
        prices, quantities = defaultdict(int), defaultdict(int)
        for item, quant, price in zip(key, *val):
            quantities[item] += int(quant)
            prices[item] += float(price)
        items = list(prices)
        writer.writerow([items,[quantities[x] for x in items],[prices[x] for x in items] ])

Answer

Working with your current output as posted in the question, you can just zip the different lists of tuples of items and quantities and prices to align the items with each other, add them up in two defaultdicts, and finally combine those to the result.

output = {('GRILLED AUSTRALIA ANGU',): (('1',), ('29.00',)), ...}

from collections import defaultdict
prices, quantities = defaultdict(int), defaultdict(int)
for key, val in output.items():
    for item, quant, price in zip(key, *val):
        quantities[item] += int(quant)
        prices[item] += float(price)

result = {item: (quantities[item], prices[item]) for item in prices}

Note that you do not need a special case for subtracting duplicates when the quantity and/or price are negative; just add the negative number. Afterwards, result is this:

{'ESCARGOT WITH GARLIC H': (1, 12.0), 
 'BRAISED BEANCURD WITH': (1, 10.0), 
 'CRISPY CHICKEN WINGS': (1, 7.0), 
 'SAUSAGE WRAPPED WITH B': (1, 10.0), 
 'ONION RINGS': (1, 6.0), 
 'PAN SEARED FOIE GRAS': (1, 15.0), 
 'Beer': (31, 93.0), 
 'Chocolate Cake': (3, 10.5), 
 'SAUTE FIELD MUSHROOM W': (1, 9.0), 
 'Carrot Cake': (4, 10.0), 
 'GRILLED AUSTRALIA ANGU': (1, 29.0)}
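
For anyone who wants to try the aggregation without the original CSV, here is a minimal self-contained sketch. It assumes only two transactions, both taken from the snippets in this answer; the names sample_output, aggregate and totals are made up for illustration and are not part of the asker's code.

```python
from collections import defaultdict

# Two transactions in the same shape as `output`:
# (items...) -> ((quantities...), (prices...)), all values still strings.
sample_output = {
    ("GRILLED AUSTRALIA ANGU",): (("1",), ("29.00",)),
    ("Beer", "Beer", "Carrot Cake", "Chocolate Cake"): (
        ("-1", "10", "1", "1"),
        ("-3.00", "30.00", "2.50", "3.50"),
    ),
}


def aggregate(transactions):
    """Sum quantity and price per item across all transactions."""
    quantities, prices = defaultdict(int), defaultdict(float)
    for items, (quants, costs) in transactions.items():
        # zip lines up the i-th item with its i-th quantity and price
        for item, quant, price in zip(items, quants, costs):
            quantities[item] += int(quant)   # negative values subtract
            prices[item] += float(price)
    return {item: (quantities[item], prices[item]) for item in prices}


totals = aggregate(sample_output)
print(totals["Beer"])  # (9, 27.0)
```

The refund line ('-1', '-3.00') needs no special handling here: it is simply added like any other value, which is why the Beer total comes out as 9 rather than 10.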

If you want to keep the individual items separate, just move the declaration of prices, quantities, and result inside the outer loop:

for key, val in output.items():
    prices, quantities = defaultdict(int), defaultdict(int)
    for item, quant, price in zip(key, *val):
        quantities[item] += int(quant)
        prices[item] += float(price)
    result = {item: (quantities[item], prices[item]) for item in prices}
    # do something with result or collect in a list

Example result for the two-beer line:

('Beer', 'Beer', 'Carrot Cake', 'Chocolate Cake') (('-1', '10', '1', '1'), ('-3.00', '30.00', '2.50', '3.50'))
{'Chocolate Cake': (1, 3.5), 'Beer': (9, 27.0), 'Carrot Cake': (1, 2.5)}
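
If the zip(key, *val) call in the loops above looks opaque, this small sketch shows what it produces for the two-beer line. The variable name rows is only for illustration.

```python
key = ("Beer", "Beer", "Carrot Cake", "Chocolate Cake")
val = (("-1", "10", "1", "1"), ("-3.00", "30.00", "2.50", "3.50"))

# *val unpacks the two inner tuples, so this is the same as
# zip(key, val[0], val[1]): one (item, quantity, price) triple per position.
rows = list(zip(key, *val))

for item, quant, price in rows:
    print(item, quant, price)
# First line printed: Beer -1 -3.00
```

Each triple still holds strings, which is why the loops convert them with int() and float() before adding.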

If you prefer the result to group the items, quantities and prices together, use this:

items = list(prices)
result = (items, [quantities[x] for x in items], [prices[x] for x in items])

The result looks like this:

(['Carrot Cake', 'Beer', 'Chocolate Cake'], [1, 9, 1], [2.5, 27.0, 3.5])
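
Since the wanted output is only available as an image, the following is a guess at one possible layout rather than the definitive answer: a sketch that writes one CSV row per item instead of putting whole lists into a single cell. It assumes Python 3 and uses io.StringIO as a stand-in for the real output file; the dictionaries are filled in by hand with the values from the example above.

```python
import csv
import io

# Values copied from the two-beer example; in the real script these
# would come from the defaultdicts built inside the loop.
items = ["Carrot Cake", "Beer", "Chocolate Cake"]
quantities = {"Carrot Cake": 1, "Beer": 9, "Chocolate Cake": 1}
prices = {"Carrot Cake": 2.5, "Beer": 27.0, "Chocolate Cake": 3.5}

# Stand-in for: open("out.csv", "w", newline="") in Python 3
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["item", "quantity", "amount"])
for item in items:
    # one row per item keeps each value in its own column
    writer.writerow([item, quantities[item], prices[item]])

csv_text = buffer.getvalue()
print(csv_text)
```

If the list-per-cell layout is what you actually want, keep the single writer.writerow call from the question instead.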