Ratchainant Thammasudjarit - 1 year ago
Python Question

Unique combinations of values in selected columns of a pandas data frame, with counts

I have my data in a pandas data frame as follows:

df1 = pd.DataFrame({'A':['yes','yes','yes','yes','no','no','yes','yes','yes','no'],
                    'B':['yes','no','no','no','yes','yes','no','yes','yes','no']})

So my data looks like this:

index A B
0 yes yes
1 yes no
2 yes no
3 yes no
4 no yes
5 no yes
6 yes no
7 yes yes
8 yes yes
9 no no

I would like to transform it into another data frame. The expected output can be built with the following Python script:

output = pd.DataFrame({'A':['no','no','yes','yes'],'B':['no','yes','no','yes'],'count':[1,2,4,3]})

So my expected output looks like this:

index A B count
0 no no 1
1 no yes 2
2 yes no 4
3 yes yes 3

I can already find all combinations and count them with the following command:
mytable = df1.groupby(['A','B']).size()

However, the combinations end up together in a single column. I would like to put each value of a combination into its own column and add one more column holding the count. Is it possible to do that? May I have your suggestions? Thank you in advance.
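To make the problem concrete, here is a minimal sketch of what that command returns. The values for column B are read off the table shown above, which is an assumption about the original data frame. The result is a Series, and the (A, B) pairs sit in its index rather than in ordinary columns, which is why they appear to share one column.

```python
import pandas as pd

# B values are taken from the printed table in the question (assumed to match the real data)
df1 = pd.DataFrame({
    'A': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'yes', 'yes', 'yes', 'no'],
    'B': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'no', 'yes', 'yes', 'no'],
})

mytable = df1.groupby(['A', 'B']).size()

# size() gives back a Series, not a DataFrame
print(type(mytable).__name__)

# The grouping keys form a two-level index named after the columns
print(list(mytable.index.names))

# A single count is looked up by its (A, B) pair
print(mytable.loc[('yes', 'no')])
```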

Answer Source

You can group by columns 'A' and 'B', call size() to count each combination, then call reset_index() to turn the group keys back into columns, and finally rename the generated column to 'count':

In [26]:

     A    B  count
0   no   no      1
1   no  yes      2
2  yes   no      4
3  yes  yes      3
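The command itself is not shown above, so the following is only a sketch of the chain the answer describes, not necessarily the exact line that was posted. It assumes the B values from the table in the question and assumes the column produced by reset_index() is labelled 0, which is what pandas does for an unnamed Series. The second variant, passing name='count' to reset_index(), is an alternative that skips the separate rename.

```python
import pandas as pd

# B values are taken from the printed table in the question (assumed to match the real data)
df1 = pd.DataFrame({
    'A': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'yes', 'yes', 'yes', 'no'],
    'B': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'no', 'yes', 'yes', 'no'],
})

# Route described in the answer: count, move the keys back to columns, rename column 0
counts = (
    df1.groupby(['A', 'B'])
       .size()
       .reset_index()
       .rename(columns={0: 'count'})
)

# Alternative: name the count column while resetting the index
counts_alt = df1.groupby(['A', 'B']).size().reset_index(name='count')

print(counts)
```

Both variants should give a frame with the columns A, B and count, one row per combination that occurs in the data.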