gtang31 gtang31 - 5 months ago 36
Python Question

How to groupby count across multiple columns in pandas

I have the following sample dataframe in Python pandas:

+---+------+------+------+
| | col1 | col2 | col3 |
+---+------+------+------+
| 0 | a | d | b |
+---+------+------+------+
| 1 | a | c | b |
+---+------+------+------+
| 2 | c | b | c |
+---+------+------+------+
| 3 | b | b | c |
+---+------+------+------+
| 4 | a | a | d |
+---+------+------+------+


I would like to perform a count of all the 'a,' 'b,' 'c,' and 'd' values across columns 1-3 so that I would end up with a dataframe like this:

+---+--------+-------+
| | letter | count |
+---+--------+-------+
| 0 | a | 4 |
+---+--------+-------+
| 1 | b | 5 |
+---+--------+-------+
| 2 | c | 4 |
+---+--------+-------+
| 3 | d | 2 |
+---+--------+-------+


One way I can do this is stack the columns on top of each other and THEN do a groupby count, but I feel like there has to be a better way. Can someone help me with this?

Answer

You can stack() the dataframe to put all columns into rows and then do value_counts:

df.stack().value_counts()

b    5
c    4
a    4
d    2
dtype: int64