Tom - 1 year ago 59
Python Question

# How can I calculate values specific to an ID provided on each line?

I have a data table of values that look like this:

``````Score ID
1.2    1
2.4    1
1.1    1
1.9    1
2.4    2
3.5    2
2.2    2
1.1    3
1.1    3
1.7    3
3.1    3
2.2    3
``````

And what I really need is to calculate the Change In Score for each ID in a separate column, such that it looks like this:

``````Score   ID  Changes
1.2     1
2.4     1   1.2
1.1     1   -1.3
1.9     1   0.8
2.4     2
3.5     2   1.1
2.2     2   -1.3
1.1     3
1.1     3   0
1.7     3   0.6
3.1     3   1.4
2.2     3   -0.9
``````

Notice how each "changes" is calculated by taking the previous score, and subtracting it from the current score: e.g. in the first changes, you get 1.2 by taking 1.2 = 2.4(current score) - 1.2(The previous score)

The problem is that these scores must only pertain to only the scores within the confines of the ID. You can't just iterate over and take the difference of scores. How would I do this? What logic can I use to test to contain "changes" within the same ID? I'm familiar with R or python (or BASH) to do this.

In Python using Pandas:

``````import pandas as pd

df = pd.DataFrame(
{'ID': [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3],
'Score': [1.2, 2.4, 1.1, 1.9, 2.4, 3.5, 2.2, 1.1, 1.1, 1.7, 3.1, 2.2]})

df['Changes'] = df.groupby('ID').Score.transform(lambda group: group.diff())

>>> df
Score  ID  Changes
0     1.2   1      NaN
1     2.4   1      1.2
2     1.1   1     -1.3
3     1.9   1      0.8
4     2.4   2      NaN
5     3.5   2      1.1
6     2.2   2     -1.3
7     1.1   3      NaN
8     1.1   3      0.0
9     1.7   3      0.6
10    3.1   3      1.4
11    2.2   3     -0.9
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download