Luis Miguel - 8 months ago
Python Question

Pandas df to dictionary with values as python lists aggregated from a df column

I have a pandas df containing 'features' for stocks, which looks like this:

[Image: features for stocks previous to training neural net]

I am now trying to create a dictionary with each unique sector as a key and a Python list of that sector's tickers as the value, so that I end up with something that looks like this:

{'consumer_discretionary': ['AAP',


I could iterate over the DataFrame rows to build the dictionary, but I would prefer a more Pythonic solution. So far, this code is a partial solution:


Any feedback is appreciated.


The solution by @wrwrwr


partially works, but it returns a pandas Series as the value instead of a Python list. Any ideas on how to convert the pandas Series into a Python list in the same line, without having to iterate over the dictionary?


Wouldn't f.set_index('ticker').groupby('sector').groups be what you want?

For example:

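from pandas import DataFrame

f = DataFrame({
        'ticker': ('t1', 't2', 't3'),
        'sector': ('sa', 'sb', 'sb'),
        'name': ('n1', 'n2', 'n3')})

# .groups maps each sector to a pandas Index of its tickers (repr abbreviated):
groups = f.set_index('ticker').groupby('sector').groups
# {'sa': Index(['t1']), 'sb': Index(['t2', 't3'])}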

To ensure that the values are plain Python lists rather than Index objects:

{k: list(v) for k, v in f.set_index('ticker').groupby('sector').groups.items()}
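For reference, the comprehension above can be wrapped into a small self-contained sketch. The helper name `tickers_by_sector` is hypothetical (it is not part of pandas), and the column names `ticker` and `sector` are assumed to match the example frame:

```python
import pandas as pd


def tickers_by_sector(frame):
    """Map each sector to a plain Python list of its tickers."""
    groups = frame.set_index('ticker').groupby('sector').groups
    # .groups maps each key to a pandas Index; list() converts it to a list
    return {sector: list(tickers) for sector, tickers in groups.items()}


f = pd.DataFrame({
    'ticker': ('t1', 't2', 't3'),
    'sector': ('sa', 'sb', 'sb'),
    'name': ('n1', 'n2', 'n3')})

result = tickers_by_sector(f)
print(result)  # {'sa': ['t1'], 'sb': ['t2', 't3']}
```

The values are ordinary lists, so they can be serialized or extended without any pandas-specific handling.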


Or, as a single chained expression that builds the lists inside apply:

f.set_index('ticker').groupby('sector').apply(lambda g: list(g.index)).to_dict()
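If moving `ticker` into the index is not otherwise needed, a variant worth trying (not taken from the answer above, just a common pandas idiom) is to group the `ticker` column directly and apply `list` to each group. A minimal sketch, assuming the same example frame:

```python
import pandas as pd

f = pd.DataFrame({
    'ticker': ('t1', 't2', 't3'),
    'sector': ('sa', 'sb', 'sb'),
    'name': ('n1', 'n2', 'n3')})

# Select the 'ticker' column per sector and collect each group into a list;
# the result is a Series of lists indexed by sector, which to_dict() converts.
by_sector = f.groupby('sector')['ticker'].apply(list).to_dict()
print(by_sector)  # {'sa': ['t1'], 'sb': ['t2', 't3']}
```

Tickers keep their original row order within each sector, and the intermediate Series never leaks into the final dictionary.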