agf1997 agf1997 - 1 month ago 15
Python Question

Adding a column with a count

I have a pandas dataframe that looks like this :

d = {'device' : ['D1', 'D1', 'D1', 'D1', 'D1', 'D2', 'D2', 'D2', 'D2', 'D2',],
'interval' : ['Day 1', 'Day 1', 'Day 1', 'Day 2', 'Day 2', 'Day 2', 'Day 3', 'Day 3', 'Day 3', 'Day 3'],
'data' : np.random.rand(10) }

df = pd.DataFrame(d)
df.set_index(['device','interval'], inplace=True)
print(df)

data
device interval
D1 Day 1 0.087297
Day 1 0.849820
Day 1 0.577146
Day 2 0.027389
Day 2 0.469095
D2 Day 2 0.685370
Day 3 0.988979
Day 3 0.092372
Day 3 0.578315
Day 3 0.052130


I'd like to add a column that counts the number of inverval occurances per device. For example, I would expect the result to look like this.

data count
device interval
D1 Day 1 0.087297 1
Day 1 0.849820 2
Day 1 0.577146 3
Day 2 0.027389 1
Day 2 0.469095 2
D2 Day 2 0.685370 1
Day 3 0.988979 1
Day 3 0.092372 2
Day 3 0.578315 3
Day 3 0.052130 4

Answer

You can add a count column by grouping on the index levels and using transform to apply a lambda that calls rank on each grouping with params method='dense' andascending=False`:

In [15]:
df['count'] = df.groupby(level=[0,1]).transform(lambda x: x.rank(method='dense', ascending=False))
df

Out[15]:
                     data  count
device interval                 
D1     Day 1     0.442826      1
       Day 1     0.307261      2
       Day 1     0.040172      3
       Day 2     0.489025      1
       Day 2     0.210805      2
D2     Day 2     0.335703      1
       Day 3     0.659730      1
       Day 3     0.658278      2
       Day 3     0.296398      3
       Day 3     0.218484      4