John John - 3 months ago 7
Python Question

Concatenating Rows into a Row in Pandas

I have a dataframe like the one below. Both columns are strings, with ValCol being a string of comma-separated integers. The index is a generic integer index with no meaning.

NameCol ValCol
Name1 555, 333
Name2 433
Name1 999
Name3 123
Name2 533


What's the best way to aggregate it to

NameCol ValCol
Name1 555, 333, 999
Name2 433, 533
Name3 123


I don't care about the order of the comma-separated integers, but I do need to keep commas between them. It will likely be a very small dataframe, <100 records, so efficiency isn't critical.

I feel like there should be some groupby approach to this, but I haven't figured it out yet.

Answer

Using a groupby approach:

df = df.groupby('NameCol')['ValCol'].apply(', '.join).reset_index()

The resulting output:

  NameCol         ValCol
0   Name1  555, 333, 999
1   Name2       433, 533
2   Name3            123
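
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question (the variable names result and result_agg are just illustrative) and assumes ValCol really does hold strings, as stated in the question.

```python
import pandas as pd

# Rebuild the example frame from the question; both columns are strings.
df = pd.DataFrame({
    "NameCol": ["Name1", "Name2", "Name1", "Name3", "Name2"],
    "ValCol": ["555, 333", "433", "999", "123", "533"],
})

# Group rows by name and join each group's strings with ", ".
# reset_index() turns the group keys back into a regular column.
result = df.groupby("NameCol")["ValCol"].apply(", ".join).reset_index()

# The same aggregation spelled with agg instead of apply.
result_agg = df.groupby("NameCol")["ValCol"].agg(", ".join).reset_index()

print(result)
```

Note that groupby sorts the group keys by default, which is why the names come out in order. If ValCol held actual integers rather than strings, ', '.join would raise a TypeError, so casting with df['ValCol'].astype(str) first would be one way around that.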