TimRich - 4 months ago
Python Question

Set values on the diagonal of pandas.DataFrame

I have a pandas DataFrame and I would like to set its diagonal to 0:

import numpy
import pandas

df = pandas.DataFrame(numpy.random.rand(5,5))
df

Out[6]:
          0         1         2         3         4
0  0.536596  0.674319  0.032815  0.908086  0.215334
1  0.735022  0.954506  0.889162  0.711610  0.415118
2  0.119985  0.979056  0.901891  0.687829  0.947549
3  0.186921  0.899178  0.296294  0.521104  0.638924
4  0.354053  0.060022  0.275224  0.635054  0.075738
5 rows × 5 columns


Currently I set the diagonal to 0 with a nested loop:

for i in range(len(df.index)):
    for j in range(len(df.columns)):
        if i==j:
            df.loc[i,j] = 0
df
Out[9]:
          0         1         2         3         4
0  0.000000  0.674319  0.032815  0.908086  0.215334
1  0.735022  0.000000  0.889162  0.711610  0.415118
2  0.119985  0.979056  0.000000  0.687829  0.947549
3  0.186921  0.899178  0.296294  0.000000  0.638924
4  0.354053  0.060022  0.275224  0.635054  0.000000
5 rows × 5 columns


But there must be a more Pythonic way than that!?

Answer
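With numpy imported as np, index the underlying array with a tuple of two index arrays (recent NumPy versions no longer treat a plain list of arrays as a multidimensional index):

In [21]: df.values[tuple([np.arange(5)]*2)] = 0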

In [22]: df
Out[22]: 
          0         1         2         3         4
0  0.000000  0.931374  0.604412  0.863842  0.280339
1  0.531528  0.000000  0.641094  0.204686  0.997020
2  0.137725  0.037867  0.000000  0.983432  0.458053
3  0.594542  0.943542  0.826738  0.000000  0.753240
4  0.357736  0.689262  0.014773  0.446046  0.000000

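Note that this hard-codes the size 5 and assumes df has the same number of rows as columns. It also relies on df.values being a writable view of the frame's data, which holds for a single-dtype frame like this one but may not hold in general (for example with mixed dtypes or with pandas copy-on-write enabled). Another way, which works for arbitrary shapes, is np.fill_diagonal, which modifies the array in place: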

In [36]: np.fill_diagonal(df.values, 0)
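
If you would rather not depend on df.values being a view, one possible variation is to fill the diagonal on an explicit copy and rebuild the frame. This is only a sketch: the 4 x 6 example data and the names arr and result are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical non-square example data (not from the question).
df = pd.DataFrame(np.arange(1.0, 25.0).reshape(4, 6))

# Take an explicit, writable copy so the result does not depend on
# whether df.values happens to be a view of the frame's data.
arr = df.to_numpy(copy=True)

# Zero the main diagonal in place; for a non-square array this covers
# the first min(rows, cols) diagonal entries.
np.fill_diagonal(arr, 0)

# Rebuild a frame with the original row and column labels.
result = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(result)
```

This leaves df itself untouched; assign result back to df if you want to replace the original.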