zelenov aleksey - 1 year ago 96
Python Question

# change rows in pandas

I have a matrix stored in a pandas DataFrame:

``````
print dfMatrix
       0       1      2      3       4
0  10000      10      8     11      10
1     10  100000     13      9      10
2      8      13  10000      9      11
3     11       9      9  10000      12
4     10      10     11     12  100000
``````

I need to reduce each row by its own minimum, i.e. subtract the row's minimum from every value in that row, row by row.
Here is the code I tried:

``````
def matrixReduction(matrix):
    minRowValues = matrix.min(axis=1)
    for i in xrange(matrix.shape[1]):
        matrix[i][:] = matrix[i][:] - minRowValues[i]
    return matrix
``````

I expect output like this:

``````
      0     1     2     3     4
0 9992     2     0     3     2
1    1 99991     4     0     1
2    0     5  9992     1     3
3    2     0     0  9991     3
4    0     0     1     2 99990
``````

but I get this output instead:

``````
      0      1     2     3      4
0  9992      1     0     2      0
1     2  99991     5     0      0
2     0      4  9992     0      1
3     3      0     1  9991      2
4     2      1     3     3  99990
``````

So it subtracts the minimums column by column instead of row by row.
How do I do this for rows?
Thanks.

You can use `sub` to subtract the per-row minimum, computed with `min(axis=1)`, from every row:

``````
print (df.min(axis=1))
0     8
1     9
2     8
3     9
4    10
dtype: int64

print (df.sub(df.min(axis=1), axis=0))
      0      1     2     3      4
0  9992      2     0     3      2
1     1  99991     4     0      1
2     0      5  9992     1      3
3     2      0     0  9991      3
4     0      0     1     2  99990
``````
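For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same `sub` approach. The frame is rebuilt from the values printed in the question; the names `row_min` and `reduced` are just illustrative.

```python
import pandas as pd

# Rebuild the matrix from the question (values copied from the printed frame).
df = pd.DataFrame([
    [10000, 10, 8, 11, 10],
    [10, 100000, 13, 9, 10],
    [8, 13, 10000, 9, 11],
    [11, 9, 9, 10000, 12],
    [10, 10, 11, 12, 100000],
])

# Minimum of each row: a Series indexed by row label.
row_min = df.min(axis=1)

# axis=0 aligns the Series with the row index,
# so each row has its own minimum subtracted.
reduced = df.sub(row_min, axis=0)

print(reduced)
```

Note that leaving out `axis=0` makes `sub` align the Series with the column labels instead, which subtracts `row_min[j]` from column `j` and reproduces the unwanted column-wise result from the question.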

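I also tried rewriting your function so that it selects rows explicitly. I originally used `ix` for the selection; `ix` has since been removed from pandas (1.0 and later), so `loc` is used here:

``````
def matrixReduction(matrix):
    # Minimum of every row, indexed by row label
    minRowValues = matrix.min(axis=1)
    # Iterate over rows (shape[0]), not columns
    for i in range(matrix.shape[0]):
        # loc[i, :] selects row i; plain matrix[i] would select column i
        matrix.loc[i, :] = matrix.loc[i, :] - minRowValues[i]
    return matrix
``````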
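The loop in the question changed columns because plain `matrix[i]` on a DataFrame looks up a column label, not a row. A small sketch of the difference, using a made-up 3x3 frame (not the data from the question) so that rows and columns are easy to tell apart:

```python
import pandas as pd

# Hypothetical non-symmetric frame, chosen so rows and columns differ.
df = pd.DataFrame([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

# Plain [] indexing on a DataFrame selects the column labelled 0.
col0 = df[0]

# .iloc[0] selects the row at position 0.
row0 = df.iloc[0]

print(col0.tolist())  # [1, 4, 7]
print(row0.tolist())  # [1, 2, 3]
```

So `matrix[i][:] = matrix[i][:] - minRowValues[i]` subtracts the minimum of row `i` from column `i`, which matches the output shown in the question.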

Timings:

``````
In [136]: %timeit (matrixReduction(df))
100 loops, best of 3: 2.64 ms per loop

In [137]: %timeit (df.sub(df.min(axis=1), axis=0))
The slowest run took 5.49 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 308 µs per loop
``````
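Outside IPython, a comparison along these lines can be reproduced with the standard `timeit` module. This is only a rough sketch: `loop_reduction` and `vectorized_reduction` are illustrative names, the loop uses `iloc` and works on a copy so every run sees the same input, and the absolute numbers will depend on the machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame([
    [10000, 10, 8, 11, 10],
    [10, 100000, 13, 9, 10],
    [8, 13, 10000, 9, 11],
    [11, 9, 9, 10000, 12],
    [10, 10, 11, 12, 100000],
])


def loop_reduction(matrix):
    # Row-by-row version; copy first so the caller's frame is not modified.
    matrix = matrix.copy()
    row_min = matrix.min(axis=1)
    for i in range(matrix.shape[0]):
        matrix.iloc[i, :] = matrix.iloc[i, :] - row_min.iloc[i]
    return matrix


def vectorized_reduction(matrix):
    # Single vectorized call; returns a new frame.
    return matrix.sub(matrix.min(axis=1), axis=0)


# Total time in seconds for 100 calls of each version.
loop_time = timeit.timeit(lambda: loop_reduction(df), number=100)
vec_time = timeit.timeit(lambda: vectorized_reduction(df), number=100)

print("loop:       %.6f s for 100 runs" % loop_time)
print("vectorized: %.6f s for 100 runs" % vec_time)
```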