S Ringne - 17 days ago
Python Question

Split a column containing two values into separate columns in a pandas DataFrame

I have a table in a pandas DataFrame:

bigram frequency
(123,3245) 2
(676,35346) 84
(93,32) 9


and so on, for 50 rows.

What I am looking for is a way to split the bigram column into two separate columns, removing the brackets and the comma, like this:

col1 col2 frequency
123 3245 2
676 35346 84
93 32 9


Is there any way to split it at the comma and remove the brackets?

Answer

If your bigram column is stored as strings, you can use the .str.extract() method with a regex to pull the numbers out of it (the extracted values are strings, so cast them with .astype(int) if you need numbers):

pd.concat([df.bigram.str.extract(r'(?P<col1>\d+),(?P<col2>\d+)'), df.frequency], axis=1)
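
For reference, here is a minimal self-contained sketch of the string case. The sample frame is rebuilt from the three rows shown in the question, and storing bigram as text is an assumption. The names parts and result are only illustrative.

```python
import pandas as pd

# Sample rows from the question; bigram is assumed to be stored as text.
df = pd.DataFrame({
    "bigram": ["(123,3245)", "(676,35346)", "(93,32)"],
    "frequency": [2, 84, 9],
})

# The named groups col1 and col2 become the column names of the extracted frame.
parts = df["bigram"].str.extract(r"(?P<col1>\d+),(?P<col2>\d+)")

# str.extract returns strings, so cast to int before combining with frequency.
result = pd.concat([parts.astype(int), df["frequency"]], axis=1)
print(result)
```

Rows that do not match the pattern come back as NaN from str.extract, in which case the astype(int) cast would fail, so it may be worth checking for missing values first on real data.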


Or if the bigram column is of tuple type:

Method 1: use pd.Series to create columns from the tuple:

pd.concat([df.bigram.apply(lambda x: pd.Series(x, index=['col1', 'col2'])), 
           df.frequency], axis=1)
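
A runnable sketch of Method 1 on the question's sample rows, assuming bigram holds real Python tuples. It also shows one alternative that is not part of the answer: building the new columns directly from a list of the tuples, which avoids creating one Series per row. The names expanded, method1 and alt are illustrative.

```python
import pandas as pd

# Sample rows from the question; bigram is assumed to hold tuples of ints.
df = pd.DataFrame({
    "bigram": [(123, 3245), (676, 35346), (93, 32)],
    "frequency": [2, 84, 9],
})

# Method 1: expand each tuple into a two-element Series with named entries.
expanded = df["bigram"].apply(lambda x: pd.Series(x, index=["col1", "col2"]))
method1 = pd.concat([expanded, df["frequency"]], axis=1)

# Alternative (not from the answer): construct the columns from a list of tuples,
# reusing the original index so the rows stay aligned.
alt = pd.DataFrame(df["bigram"].tolist(), index=df.index, columns=["col1", "col2"])
alt = pd.concat([alt, df["frequency"]], axis=1)

print(method1)
print(alt)
```

Both variants should give the same three columns. The list-based construction is usually the quicker of the two on larger frames, though for 50 rows the difference is unlikely to matter.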

Method 2: use .str to get the first and second elements of each tuple:

df['col1'], df['col2'] = df.bigram.str[0], df.bigram.str[1]
df = df.drop('bigram', axis=1)
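
A small sketch of Method 2 on the same sample rows, again assuming bigram holds tuples. Two steps here are additions rather than part of the answer: the astype(int) cast, since element access through .str can leave the new columns with object dtype, and the final reordering, since the new columns are appended after frequency.

```python
import pandas as pd

# Sample rows from the question; bigram is assumed to hold tuples of ints.
df = pd.DataFrame({
    "bigram": [(123, 3245), (676, 35346), (93, 32)],
    "frequency": [2, 84, 9],
})

# .str[i] picks the i-th element of each tuple; cast so the columns are integer typed.
df["col1"] = df["bigram"].str[0].astype(int)
df["col2"] = df["bigram"].str[1].astype(int)
df = df.drop("bigram", axis=1)

# Put the columns in the order requested in the question.
df = df[["col1", "col2", "frequency"]]
print(df)
```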