Ben Ben - 2 months ago 8
Python Question

How to make a group id using pandas

R's data.table package has a really convenient .GRP symbol for generating group index values.

library(data.table)
dt <- data.table(
  Grp=c("a", "z", "a", "f", "f"),
  Val=c(3, 2, 1, 2, 2)
)
dt[, GrpIdx := .GRP, by=Grp]

   Grp Val GrpIdx
1:   a   3      1
2:   z   2      2
3:   a   1      1
4:   f   2      3
5:   f   2      3


What's the best way to accomplish the same thing using pandas?

import pandas as pd
df = pd.DataFrame({'Grp':["a", "z", "a", "f", "f"], 'Val':[3, 2, 1, 2, 2]})

Answer

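You could use rank with method='dense' to assign one integer per unique group; it accepts string values as well. Note that rank numbers the groups in sorted order (a=1, f=2, z=3 here), not in order of first appearance as .GRP does (a=1, z=2, f=3):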

df['GrpIdx'] = df['Grp'].rank(method='dense').astype(int)
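
If the ids should follow the order of first appearance, as in the data.table output above, one option is pd.factorize or groupby(..., sort=False).ngroup(). The following is a minimal sketch rather than the only way to do it; it assumes a reasonably recent pandas, and the column name GrpIdx2 is made up for illustration. Both calls count from 0, hence the + 1:

```python
import pandas as pd

df = pd.DataFrame({'Grp': ["a", "z", "a", "f", "f"], 'Val': [3, 2, 1, 2, 2]})

# factorize returns (codes, uniques); codes label each distinct value
# in order of first appearance, starting at 0.
df['GrpIdx'] = pd.factorize(df['Grp'])[0] + 1

# Alternative: number the groups of a groupby. sort=False keeps the
# groups in first-appearance order instead of sorting the keys.
# 'GrpIdx2' is a hypothetical column name used only for comparison.
df['GrpIdx2'] = df.groupby('Grp', sort=False).ngroup() + 1

print(df)
```

Both columns should come out identical here; ngroup is the more natural choice when grouping by several columns at once.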
