spiff spiff - 1 month ago 8
Python Question

Python data Manipulation: Splitting Data from one column to make more rows in the same dataframe

So my input is:

Col1 Col2 Col3 Col4
0 123 abc,def,ghi Country1 XXX
1 456 pqr,stu Country2 XX
2 789 xyz Country2 YY


I want my output as :

Col1 Col2 Col3 Col4
0 abc 123 Country1 XXX
1 def 123 Country1 XXX
2 ghi 123 Country1 XXX
3 pqr 456 Country2 XX
4 stu 456 Country2 XX
5 xyz 789 Country2 YY


What would be the most pythonic way to do this pls? Thanks vm!

Answer

You can use str.split with stack for creating Series for join to original DataFrame:

print (df.Col2
      .str
      .split(',',expand=True)
      .stack()
      .reset_index(drop=True, level=1)
      .rename('Col2'))

0    abc
0    def
0    ghi
1    pqr
1    stu
2    xyz
Name: Col2, dtype: object


print (df.drop('Col2', axis=1)
             .join
             (
             df.Col2
             .str
             .split(',',expand=True)
             .stack()
             .reset_index(drop=True, level=1)
             .rename('Col2')           
             ))

   Col1      Col3 Col4 Col2
0   123  Country1  XXX  abc
0   123  Country1  XXX  def
0   123  Country1  XXX  ghi
1   456  Country2   XX  pqr
1   456  Country2   XX  stu
2   789  Country2   YY  xyz