Luis Miguel - 1 month ago
Python Question

Pandas df to dictionary with values as python lists aggregated from a df column

I have a pandas df containing 'features' for stocks, which looks like this:

[Image: features for stocks previous to training neural net]

I am now trying to create a dictionary with each unique sector as a key and a python list of that sector's tickers as the value, so that I end up with something that looks like this:

{'consumer_discretionary': ['AAP',
'AMZN',
'AN',
'AZO',
'BBBY',
'BBY',
'BWA',
'KMX',
'CCL',
'CBS',
'CHTR',
'CMG',


etc.

I could iterate over the pandas df rows to create the dictionary, but I prefer a more pythonic solution. Thus far, this code is a partial solution:

df.set_index('sector')['ticker'].to_dict()
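To illustrate why this is only a partial solution, here is a minimal sketch on a made-up three-row frame (the tickers and sector names are hypothetical, not taken from the real data). A dictionary holds one value per key, so a sector with several tickers ends up with only one of them:

```python
import pandas as pd

# Hypothetical toy data: sector 'sb' has two tickers.
df = pd.DataFrame({
    'ticker': ['t1', 't2', 't3'],
    'sector': ['sa', 'sb', 'sb'],
})

# Duplicate index labels collapse when converted to a dict:
# a later ticker overwrites an earlier one with the same sector.
partial = df.set_index('sector')['ticker'].to_dict()
print(partial)  # {'sa': 't1', 'sb': 't3'}
```

Here 't2' is lost, which is why the tickers need to be grouped per sector before building the dictionary.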


Any feedback is appreciated.

UPDATE:

The solution by @wrwrwr

df.set_index('ticker').groupby('sector').groups


partially works, but it returns a pandas Index as the value instead of a python list. Any ideas on how to convert it into a python list in the same line, without having to iterate over the dictionary?

Answer

Wouldn't f.set_index('ticker').groupby('sector').groups be what you want?

For example:

from pandas import DataFrame

# Small example frame: sector 'sb' contains two tickers
f = DataFrame({
        'ticker': ('t1', 't2', 't3'),
        'sector': ('sa', 'sb', 'sb'),
        'name': ('n1', 'n2', 'n3')})

groups = f.set_index('ticker').groupby('sector').groups
# {'sa': Index(['t1']), 'sb': Index(['t2', 't3'])}

To ensure that they have the type you want:

{k: list(v) for k, v in f.set_index('ticker').groupby('sector').groups.items()}

or:

f.set_index('ticker').groupby('sector').apply(lambda g: list(g.index)).to_dict()
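Another variant that may read more directly is to group the ticker column itself and apply list to each group, which avoids moving the tickers into the index first. This is a minimal sketch on the same kind of toy data as above (the frame and the name sector_to_tickers are illustrative assumptions):

```python
import pandas as pd

# Toy data mirroring the example frame above.
f = pd.DataFrame({
    'ticker': ['t1', 't2', 't3'],
    'sector': ['sa', 'sb', 'sb'],
    'name': ['n1', 'n2', 'n3'],
})

# Select the ticker column per sector, turn each group into a plain
# python list, then convert the resulting Series to a dict.
sector_to_tickers = f.groupby('sector')['ticker'].apply(list).to_dict()
print(sector_to_tickers)  # {'sa': ['t1'], 'sb': ['t2', 't3']}
```

The values are ordinary python lists, so no further conversion should be needed.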