Gerrit - 1 year ago 91
Python Question

# Count occurences in DataFrame

I've a Dataframe in this format:

``````| Department | Person | Power  | ... |
|------------|--------|--------|-----|
| ABC        | 1234   |  75    | ... |
| ABC        | 1235   |  25    | ... |
| DEF        | 1236   |  50    | ... |
| DEF        | 1237   | 100    | ... |
| DEF        | 1238   |  25    | ... |
| DEF        | 1239   |  50    | ... |
``````

What I now want to get is the sum of occurences for each value in the power column. How can I get this from my DataFrame?

``````| Department | 100 |  75 |  50 |  25 |
|------------|-----|-----|-----|-----|
| ABC        |   0 |   1 |   0 |   1 |
| DEF        |   1 |   0 |   2 |   1 |
``````

You can use `value_counts` with `sort_index`, then generate `DataFrame` by `to_frame` and last transpose by `T`:

``````print (df.Power.value_counts().sort_index(ascending=False).to_frame().T)
100  75   50   25
Power    1    1    2    2
``````

EDIT by comment:

You need `crosstab`:

``````print (pd.crosstab(df.Department, df.Power).sort_index(axis=1, ascending=False))
Power       100  75   50   25
Department
ABC           0    1    0    1
DEF           1    0    2    1
``````

Faster another solution with `groupby` and `unstack`:

``````print (df.groupby(['Department','Power'])
.size()
.unstack(fill_value=0)
.sort_index(axis=1, ascending=False))

Power       100  75   50   25
Department
ABC           0    1    0    1
DEF           1    0    2    1
``````

If need `groupby` by columns `Department` and `Person`, add column `Person` to `groupby` to second position (thank you piRSquared):

``````print (df.groupby(['Department','Person', 'Power'])
.size()
.unstack(fill_value=0)
.sort_index(axis=1, ascending=False))

Power              100  75   50   25
Department Person
ABC        1234      0    1    0    0
1235      0    0    0    1
DEF        1236      0    0    1    0
1237      1    0    0    0
1238      0    0    0    1
1239      0    0    1    0
``````

EDIT1 by comment:

If need add another missing values, use `reindex`:

``````print (df.groupby(['Department','Power'])
.size()
.unstack(fill_value=0)
.reindex(columns=[100,75,50,25,0], fill_value=0))

Power       100  75   50   25   0
Department
ABC           0    1    0    1    0
DEF           1    0    2    1    0
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download