Thomas Matthew - 1 month ago 13

Python Question

Given the below dataframe:

    import pandas as pd
    import numpy as np
    a = np.arange(16).reshape(4, 4)
    df = pd.DataFrame(data=a, columns=['a','b','c','d'])

I'd like to produce the following result:

    df([[ NaN,   1,   2,   3],
        [ NaN, NaN,   6,   7],
        [ NaN, NaN, NaN,  11],
        [ NaN, NaN, NaN, NaN]])

So far I've tried using `np.tril_indices`:

    il1 = np.tril_indices(4)
    a[il1] = 0

gives:

    array([[ 0,  1,  2,  3],
           [ 0,  0,  6,  7],
           [ 0,  0,  0, 11],
           [ 0,  0,  0,  0]])

...which is almost what I'm looking for, but barfs at assigning NaN:

`ValueError: cannot convert float NaN to integer`

while:

`df[il1] = 0`

gives:

`TypeError: unhashable type: 'numpy.ndarray'`

So if I want to fill the bottom triangle of a dataframe with NaN: 1) does it have to be a numpy array, or can I do this with pandas directly? And 2) is there a way to fill the bottom triangle with NaN rather than using `numpy.fill_diagonal`?

Another failed solution: filling the diagonal of the np array with zeros, then masking on zero and reassigning to np.nan. It converts zero values above the diagonal to NaN when they should be preserved as zero!
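To make that pitfall concrete, here is a minimal sketch. It is not the original failed code (which isn't shown); the array `data` is a hypothetical example shifted so that a genuine zero sits above the diagonal. It contrasts masking by value with assigning by position:

```python
import numpy as np

# Hypothetical data with a genuine zero above the diagonal (at position [0, 1]).
data = np.arange(16).reshape(4, 4).astype(float) - 1

# Value-based masking: zero the lower triangle, then replace every zero with NaN.
by_value = data.copy()
by_value[np.tril_indices(4)] = 0
by_value[by_value == 0] = np.nan

# Position-based masking: assign NaN by index, never looking at the values.
by_position = data.copy()
by_position[np.tril_indices(4)] = np.nan

print(np.isnan(by_value[0, 1]))  # the genuine zero was swept up by the value mask
print(by_position[0, 1])         # the genuine zero is left alone
```

Because the positional version never inspects the data, it cannot confuse a real zero with a placeholder.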

Answer

You need to cast `a` to `float`, because the `type` of `NaN` is `float`:

```
import numpy as np
import pandas as pd

# Cast to float so the array can hold NaN
a = np.arange(16).reshape(4, 4).astype(float)
print(a)
# [[ 0. 1. 2. 3.]
# [ 4. 5. 6. 7.]
# [ 8. 9. 10. 11.]
# [ 12. 13. 14. 15.]]

# Indices of the lower triangle, diagonal included
il1 = np.tril_indices(4)
a[il1] = np.nan
print(a)
# [[ nan 1. 2. 3.]
# [ nan nan 6. 7.]
# [ nan nan nan 11.]
# [ nan nan nan nan]]

df = pd.DataFrame(data=a, columns=['a','b','c','d'])
print(df)
#     a    b    c     d
# 0 NaN  1.0  2.0   3.0
# 1 NaN  NaN  6.0   7.0
# 2 NaN  NaN  NaN  11.0
# 3 NaN  NaN  NaN   NaN
```
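
On the first part of the question (whether this can be done with pandas directly), one possible approach is `DataFrame.where` with a boolean mask built by `np.triu`. This is a sketch of an alternative, not part of the answer above; the names `keep` and `result` are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4), columns=['a', 'b', 'c', 'd'])

# Boolean mask that is True strictly above the main diagonal (k=1 excludes the diagonal).
keep = np.triu(np.ones(df.shape, dtype=bool), k=1)

# DataFrame.where keeps values where the mask is True and puts NaN elsewhere.
# It returns a new frame; the original integer df is left unchanged.
result = df.where(keep)
print(result)
```

Since the mask depends only on position, zeros above the diagonal are preserved, and no explicit cast is needed because pandas upcasts the integer columns when it inserts NaN.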