chattrat423 - 3 months ago
Python Question

python dictionary iteration for data frame filtering with multiple conditions

I am trying to build a basic search engine that returns results based on a parsed query.

I have a dictionary that is user generated based on parsed input from their string:

input = {"color": ["black"], "make": ["honda"], "type": [""]}


I am then trying to use that input to search and filter a dataset (which I am currently storing as a pandas DataFrame, so please advise if that is not optimal).

list(df.columns.values) = make,type,color,mpg,year

honda,coupe,red,32,2014
bmw,suv,black,21,2012
honda,suv,black,24,2015
vw,sedan,black,31,2016
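For anyone who wants to reproduce this, here is a minimal sketch that loads the sample rows above into a DataFrame. Inlining the rows as CSV text and reading them with io.StringIO is an assumption made for illustration; the question does not say how the data is actually loaded.

```python
import io

import pandas as pd

# Sample rows from the question, inlined as CSV text (assumed format)
csv_text = """make,type,color,mpg,year
honda,coupe,red,32,2014
bmw,suv,black,21,2012
honda,suv,black,24,2015
vw,sedan,black,31,2016
"""

# Parse the text as if it were a CSV file; the first line becomes the header
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns.values))
```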


I need to iterate over the valid values of my input dictionary (notice that ‘type’ doesn’t have a value) and filter based on what the user entered (here, ‘color’ and ‘make’). Sometimes they might include the type and leave out the color, etc., so I will not always have a value for every key in my dictionary.
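One possible reading of "valid values" is any non-empty string in a key's list. A minimal sketch under that assumption follows. The names query and valid are hypothetical; query stands in for the input dictionary so the Python built-in input is not shadowed.

```python
# Hypothetical stand-in for the parsed user input shown earlier
query = {"color": ["black"], "make": ["honda"], "type": [""]}

# Drop empty strings from each list, then drop keys left with no values
valid = {key: [v for v in values if v] for key, values in query.items()}
valid = {key: values for key, values in valid.items() if values}

print(valid)  # {'color': ['black'], 'make': ['honda']}
```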

Pseudocode:

for each valid value in my input dictionary:
    filter df by appropriate_column=appropriate_value


So given my input example, I would filter my df down to only entries that were ‘black’ and made by ‘honda’.

Answer

Let d be your dict, then:

import functools

# One boolean mask per key; an empty entry ([""]) matches every row
cond = [df[k].apply(lambda x: x in v if v != [''] else True) for k, v in d.items()]
cond_total = functools.reduce(lambda x, y: x & y, cond)
print(df[cond_total])

Output:

    make type  color  mpg  year
2  honda  suv  black   24  2015
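As a possible alternative, the same filter can be sketched with Series.isin instead of apply, skipping keys whose values are all empty. This is not the answerer's method. The helper name filter_frame is hypothetical, and the DataFrame is rebuilt from the question's sample rows so the snippet runs on its own. Starting from an all-True mask also covers an empty dictionary, where functools.reduce without an initial value would raise a TypeError.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    "make": ["honda", "bmw", "honda", "vw"],
    "type": ["coupe", "suv", "suv", "sedan"],
    "color": ["red", "black", "black", "black"],
    "mpg": [32, 21, 24, 31],
    "year": [2014, 2012, 2015, 2016],
})
d = {"color": ["black"], "make": ["honda"], "type": [""]}


def filter_frame(frame, query):
    """Return rows matching every key in query that has non-empty values."""
    # Start with every row selected, then narrow the mask key by key
    mask = pd.Series(True, index=frame.index)
    for column, values in query.items():
        wanted = [v for v in values if v]  # ignore empty strings
        if wanted:
            mask &= frame[column].isin(wanted)
    return frame[mask]


result = filter_frame(df, d)
print(result)
```

With the sample data this selects the single black honda row (index 2), matching the output above.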