AMS - 5 months ago 27

Python Question

I am trying to tabulate a change in condition using a 'groupby' but am stumped and would appreciate any guidance. I have a data frame as follows:

`SUBJECT TYPE`

1 1

1 2

1 2

2 1

2 1

3 1

3 3

3 5

I would like to generate a statement that tabulates any positive change, ignores any negative change, and generates a count of change per subject. For example, the output of the above would be:

`Subject TYPE`

1 1

2 0

3 2

Would I need create an if/else clause using pandas, or is there a simpler way to achieve this using summit? Maybe something like...

`def tabchange(type, subject):`

current_subject = subject[0]

type_diff = type - type

j = 1

for i in range(1,len(type)):

type_diff[i] = type[i] - type[i-j]

if subject[i] == current_subject:

if type_diff[i] > 0:

new_row = 1

j += 1

else:

j = 1

else:

new_row[i] = 0

current_subject = subject[i]

return new_row

Answer

```
import pandas as pd
df = pd.DataFrame({'SUBJECT': [1, 1, 1, 2, 2, 3, 3, 3],
'TYPE': [1, 2, 2, 1, 1, 1, 3, 5]})
grouped = df.groupby('SUBJECT')
df['TYPE'] = grouped['TYPE'].diff() > 0
result = grouped['TYPE'].agg('sum')
```

yields

```
SUBJECT
1 1.0
2 0.0
3 2.0
Name: TYPE, dtype: float64
```

Above, `df`

is grouped by `SUBJECT`

and the diff is taken of the `TYPE`

column:

```
In [253]: grouped = df.groupby('SUBJECT'); df['TYPE'] = grouped['TYPE'].diff() > 0
In [254]: df
Out[254]:
SUBJECT TYPE
0 1 False
1 1 True
2 1 False
3 2 False
4 2 False
5 3 False
6 3 True
7 3 True
```

Then, again grouping by `SUBJECT`

, the result is obtained by counting the number of `True`

s in the `TYPE`

column:

```
In [255]: result = grouped['TYPE'].agg('sum'); result
Out[255]:
SUBJECT
1 1.0
2 0.0
3 2.0
Name: TYPE, dtype: float64
```