Fizi - 1 year ago

Python Question

I have a list such that

`l = ['xyz','abc','mnq','qpr']`

These values are weighted such that

`xyz>abc>mnq>qpr`

I have a pandas dataframe with a column that has sets of values.

`COL_NAME`

0 set(['xyz', 'abc'])

1 set(['xyz'])

2 set(['mnq','qpr'])

Now I want to keep only the highest-weighted value in each set, so that after applying a custom function I am left with

`COL_NAME`

0 set(['xyz'])

1 set(['xyz'])

2 set(['mnq'])

Is there an elegant way to do this without resorting to a dictionary of weights?

Answer Source

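You can use `pd.Categorical` with the parameter `ordered=True` and set `categories=l[::-1]` to get the order you'd like. The list is reversed because an ordered categorical ranks its categories from lowest to highest, while `l` lists the highest value first.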

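```
import pandas as pd

def max_cat(x):
    # Categories run lowest to highest, so reverse l (which is highest first)
    return set([pd.Categorical(list(x), categories=l[::-1], ordered=True).max()])

df.COL_NAME.apply(max_cat)

# Output:
0    {xyz}
1    {xyz}
2    {mnq}
Name: COL_NAME, dtype: object
```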
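If you'd rather not build a categorical for every row, a plain-Python alternative (not the approach above) is to rank each value by its position in `l`. This is only a minimal sketch: `highest_ranked` and `rows` are hypothetical names for illustration, and it assumes every value in a set appears in `l` and that no set is empty.

```python
# Priority list from the question, highest-ranked value first.
l = ['xyz', 'abc', 'mnq', 'qpr']


def highest_ranked(values, ranking=l):
    """Return a one-element set with the member of `values` that comes first in `ranking`."""
    # list.index gives each value's position in the ranking; the smallest
    # position is the highest rank, so min() picks the winner.
    # Raises ValueError if a value is missing from `ranking` or `values` is empty.
    return {min(values, key=ranking.index)}


# Stand-in for the column of sets in the question (no pandas needed here).
rows = [{'xyz', 'abc'}, {'xyz'}, {'mnq', 'qpr'}]
result = [highest_ranked(row) for row in rows]
print(result)
```

On the DataFrame from the question this could presumably be applied as `df.COL_NAME.apply(highest_ranked)`. Note that `list.index` scans the list on every lookup, which is fine for a short ranking like this one but may be slow for a long one.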