alfheim alfheim - 1 month ago 11
Python Question

Python pandas: using map with regular expressions

I have a dict:

dealer = {
'ESSELUNGA': 'Spesa',
'DECATHLON 00000120': 'Sport',
'LEROY MERLIN': 'Casa',
'CONAD 8429': 'Spesa',
'IKEA': 'Casa',
'F.LLI MADAFFARI': 'Spesa',
'SUPERMERCATO IL GIGANT': 'Spesa',
'NATURASI SPA': 'Spesa',
'ESSELUNGA SETTIMO MILANE': 'Spesa'
}


and I want to use it to map a column of a pandas DataFrame:

entries['Categoria'] = entries['Commerciante'].map(dealer)


Is there a way to use a regex when mapping on the "Commerciante" column? That way I could rewrite dealer like this:

dealer = {
'ESSELUNGA': 'Spesa',
'DECATHLON': 'Sport',
'LEROY MERLIN': 'Casa',
'CONAD': 'Spesa',
'IKEA': 'Casa',
'F.LLI MADAFFARI': 'Spesa',
'SUPERMERCATO IL GIGANT': 'Spesa',
'NATURASI SPA': 'Spesa',
'ESSELUNGA SETTIMO MILANE': 'Spesa'
}


and have it match both "DECATHLON" and "DECATHLON 00000120".

Answer

Thank you to all of you. I used your suggestions to resolve my problem. I defined a new function:

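import re

def dealer_replace(dealer_dict, text):
    # Try longer keys first so the most specific dealer name wins.
    keys = sorted(dealer_dict, key=len, reverse=True)
    regex = re.compile("(%s)" % "|".join(map(re.escape, keys)))

    # Search once and reuse the match object.
    ret = regex.search(text)
    if ret:
        return dealer_dict[ret.group()]
    return None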

and used it with apply:

entries['Categoria'] = entries['Commerciante'].apply(lambda v: dealer_replace(dealer, str(v)))
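
For larger frames, the same idea can probably be done without a Python-level function call per row. The following is only a sketch under assumptions: the sample rows in entries are made up for illustration, and dealer is a shortened version of the dict from the question. It builds one alternation pattern from the keys, pulls out the matched key with Series.str.extract, and then maps that key to its category.

```python
import re
import pandas as pd

# Shortened version of the dict from the question.
dealer = {
    "ESSELUNGA": "Spesa",
    "DECATHLON": "Sport",
    "CONAD": "Spesa",
    "IKEA": "Casa",
    "ESSELUNGA SETTIMO MILANE": "Spesa",
}

# Hypothetical sample data, only for illustration.
entries = pd.DataFrame(
    {"Commerciante": ["DECATHLON 00000120", "CONAD 8429", "IKEA", "UNKNOWN SHOP", None]}
)

# Longest keys first, so the alternation prefers the most specific name.
keys = sorted(dealer, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(k) for k in keys) + ")"

# str.extract returns the first match per row (NaN when nothing matches);
# map then translates the matched key into its category.
matched = entries["Commerciante"].str.extract(pattern, expand=False)
entries["Categoria"] = matched.map(dealer)
```

Rows with no matching dealer, or with a missing value, end up as NaN in "Categoria" instead of None, so this is not a drop-in replacement if the rest of the code checks for None.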