Pavel Pavel - 2 months ago 29
R Question

Pandas scatter_matrix analog function to pairs(lower.panel, upper.panel)

I need to create a scatter matrix in Python. I tried using scatter_matrix for this but I would like to leave only the scatter plots above the diagonal line.

I'm only just getting started (I have not got far) and I run into trouble when the columns have names rather than the default integer labels.

Here is my code:

import itertools
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data=pd.DataFrame(np.random.randint(0,100,size=(10, 5)), columns=list('ABCDE')) #THE PROBLEM IS HERE - I WILL HAVE COLUMNS WITH NAMES

d = data.shape[1]

fig, axes = plt.subplots(nrows=d, ncols=d, sharex=True, sharey=True)
for i in range(d):
    for j in range(d):
        ax = axes[i,j]
        if i == j:
            ax.text(0.5, 0.5, "Diagonal", transform=ax.transAxes,
                    horizontalalignment='center', verticalalignment='center',
                    fontsize=16)
        else:
            ax.scatter(data[j], data[i], s=10)

Answer

The problem is in how you select a column from the data frame: data[j] looks up a column labelled j, but your columns are labelled 'A' to 'E', not 0 to 4. You can use iloc to select columns by integer position instead. Change your last line to:

ax.scatter(data.iloc[:,j], data.iloc[:,i], s=10)
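
To illustrate the difference between the two kinds of lookup, here is a small sketch using the same random data frame as in the question. The variable names (by_label, by_position, lookup_failed) are only for illustration.

```python
import numpy as np
import pandas as pd

# Same shape and column names as in the question.
data = pd.DataFrame(np.random.randint(0, 100, size=(10, 5)), columns=list("ABCDE"))

# Label-based selection: works because a column named "A" exists.
by_label = data["A"]

# Position-based selection: first column, whatever it is called.
by_position = data.iloc[:, 0]

# Both refer to the same column.
same = by_label.equals(by_position)

# data[0] is a label lookup too, and there is no column labelled 0.
try:
    data[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(same, lookup_failed)
```

An alternative that should behave the same way is to translate the position into a label first, e.g. data[data.columns[j]].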

With the iloc change applied, this gives:

[image: the resulting 5x5 scatter matrix, with a scatter plot in every off-diagonal panel]
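
The question also asks to keep only the scatter plots above the diagonal. One possible way to do that, sketched here as an assumption rather than a pandas feature, is to plot only where j > i and hide the panels below the diagonal. The function name upper_triangle_scatter is hypothetical, and putting the column name on the diagonal is my own choice in place of the "Diagonal" placeholder text.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def upper_triangle_scatter(data):
    """Draw scatter plots only above the diagonal (hypothetical helper)."""
    d = data.shape[1]
    # squeeze=False keeps axes two-dimensional even for a single column.
    fig, axes = plt.subplots(nrows=d, ncols=d, sharex=True, sharey=True,
                             squeeze=False)
    for i in range(d):
        for j in range(d):
            ax = axes[i, j]
            if i == j:
                # Diagonal: show the column name.
                ax.text(0.5, 0.5, str(data.columns[i]), transform=ax.transAxes,
                        horizontalalignment='center', verticalalignment='center',
                        fontsize=16)
            elif j > i:
                # Above the diagonal: column j on x, column i on y.
                ax.scatter(data.iloc[:, j], data.iloc[:, i], s=10)
            else:
                # Below the diagonal: hide the panel entirely.
                ax.set_visible(False)
    return fig, axes


data = pd.DataFrame(np.random.randint(0, 100, size=(10, 5)), columns=list("ABCDE"))
fig, axes = upper_triangle_scatter(data)
```

Call plt.show() afterwards to display the figure. Note that with sharex=True and sharey=True the tick labels are drawn on the bottom row and left column, so hiding the lower panels also hides most of them; you may want to drop the sharing or re-enable tick labels on the visible panels.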