Vivek Kalyanarangan - 5 months ago 53

Python Question

Here's the data:

`ID Type`

1 In

1 In

1 Out

1 In

2 Out

2 In

2 In

2 In

2 Out

Answer

Try this:

```
df.groupby('ID')['Type'].agg(lambda x: (x=='In').rolling(3).apply(lambda x: x.all()).max())
Out[34]:
ID
1 0.0
2 1.0
Name: Type, dtype: float64
```

It will return 1 for the groups that meet your criterion, 0 otherwise.

It first groups by ID and takes the Type column. For your example, it has two groups: `{1: ['In', 'In', 'Out', 'In'], 2: ['Out', 'In', 'In', 'In', 'Out']}`

. For each group (`x`

), it first creates a boolean series `x=='In'`

. The series are `[True, True, False, True]`

and `[False, True, True, True, False]`

. Now, on those series, it applies the rolling function. It takes three at a time and evaluates `x.all()`

. For the first group, the first three (`[True, True, False]`

) and the second three (`[True, False, True]`

) returns False because all three should be True. The maximum of these two False's is 0. For the second group, the rolling method will produce (`[False, True, True], [True, True, True], [True, True, False]`

) so for the second one `x.all()`

will be True and therefore the maximum will be 1.

`Series.rolling()`

was introduced in pandas 0.18 I believe. For earlier versions, you can use:

```
df.groupby('ID')['Type'].agg(lambda x: pd.rolling_apply(x=='In', 3, lambda x: x.all()).max())
```