Stefano Potter Stefano Potter - 5 months ago 17
Python Question

Multiple conditional data adjustments

I have a df like this:

Species Status T1 T2 T3
0 Kentucky L 96.0 88.0 79.0
1 Eupesu L 0.0 2.0 3.0
2 Serrated L 1.0 4.0 2.0
3 Smooth_Brome L 2.0 0.0 1.0
4 U1 L 3.0 6.0 10.0
Total 102.0 100.0 95.0


I want to write a statement or function that says if the total for
T1
(which is 102.0) is less than 100 then take 100 - the total value within
T1
and add that value to the maximum value. If
T1
is greater than 100 then take the total value of
T1
and subtract it from 100 and subtract that value from the maximum within
T1
. I want to do the same thing with the total for
T2
and
T3
as well. Essentially the total for every column should be 100.

I have been trying to do it with this code:

if df.T1.Total < 100:
df.T1.max()=(100-df.T1.Total)+df.T1.max()
if df.T1.Total > 100:
df.T1.max()= df.T1.max()-(df.T1.Total-100)
if df.T2.Total < 100:
df.T2.max()=(100-df.T2.Total)+df.T2.max()
if df.T2.Total > 100:
df.T2.max()=df.T2.max()-(df.T2.Total-100)
if df.T3.Total < 100:
df.T3.max()=(100-df.T3.Total)+df.T3.max()
if df.T3.Total > 100:
df.T3.max()=df.T3.max()-(df.T3.Total-100)
df.loc['Total']=df.sum()
print df


and my desired output would be this:

Species Status T1 T2 T3
0 Kentucky L 94.0 88.0 84.0
1 Eupesu L 0.0 2.0 3.0
2 Serrated L 1.0 4.0 2.0
3 Smooth_Brome L 2.0 0.0 1.0
4 U1 L 3.0 6.0 10.0
Total 100.0 100.0 100.0


that code though returns:

SyntaxError: can't assign to function call


and I am pretty sure there is a better way to do this in the first place.

Answer

Here's how you can adjust the values:

df = df.drop('Total').set_index(['Species', 'Status']) # recreating 'Total' and leaving only numerical columns
difference = 100 - df.sum() # adjustment values
for col, index in df.idxmax().items(): # works also if max() occurs in different rows
    df.loc[index, col] += difference[col] # adjusting each max value

to get:

Species      Status                  
Kentucky     L       94.0  88.0  84.0
Eupesu       L        0.0   2.0   3.0
Serrated     L        1.0   4.0   2.0
Smooth_Brome L        2.0   0.0   1.0
U1           L        3.0   6.0  10.0

so that:

result = pd.concat([df.reset_index(), df.sum().to_frame().T]).fillna('')
result.index = result.index.tolist()[:-1] + ['Total']

yields:

            Species Status     T1     T2     T3
0          Kentucky      L   94.0   88.0   84.0
1            Eupesu      L    0.0    2.0    3.0
2          Serrated      L    1.0    4.0    2.0
3      Smooth_Brome      L    2.0    0.0    1.0
4                U1      L    3.0    6.0   10.0
Total                       100.0  100.0  100.0
Comments