Andrea Ianni - 1 month ago
Question

pylab: plotting points with colors and labels (IDs, not categories)

I'm trying to plot points with both colors and labels. This is not the usual problem: Python users typically use "labels" to mean categories. In my case I want the color to represent a feature, while the label is an identifier for the point itself.
Here is a toy example:

x = [-0.01611772, 1.51755901, -0.64869352, -1.80850313, -0.11505037]
y = [ 0.04845168, -0.45576903, 0.62703651, -0.24415787, -0.41307092]

colors = ['b', 'g', 'r', 'b', 'r']
labels = ['Gioele', 'Felix', 'Elpi', 'Roro', 'Cacara']


I'd like to use the scatter function. According to the "quick" documentation shown by my IDE:

def scatter(x, y, s=20, c=None, marker='o', cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None, hold=None, data=None, **kwargs)
Inferred type: (x: Any, y: Any, s: int, c: Any, marker: unicode, cmap: Any, norm: Any, vmin: Any, vmax: Any, alpha: Any, linewidths: Any, verts: Any, edgecolors: Any, hold: Any, data: Any, kwargs: dict) -> Any


So my attempt was:

import pylab
pylab.scatter(x, y, c=colors, data=labels)
pylab.show()


but it seems to ignore the data=labels part.

In addition: supposing the labels can be plotted, is there a way to place them in a "smart" way, i.e. so that the labels don't hide each other? I need something similar to the R package ggrepel.

Answer

I think using plt.annotate is an option here. To take your example:

import matplotlib.pyplot as plt

x = [-0.01611772,  1.51755901, -0.64869352, -1.80850313, -0.11505037]
y = [ 0.04845168, -0.45576903,  0.62703651, -0.24415787, -0.41307092]
colors = ['b', 'g', 'r', 'b', 'r']
labels = ['Gioele', 'Felix', 'Elpi', 'Roro', 'Cacara']

plt.scatter(x, y, c=colors)
for label, xi, yi in zip(labels, x, y):
    # xytext is the label's offset from the point, in points
    plt.annotate(label, xy=(xi, yi), xytext=(5, 5), textcoords='offset points',
                 ha='left', va='bottom')
plt.show()

This draws the scatter plot with each label placed next to its point.
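On why data=labels appears to be ignored: in matplotlib, the data argument is not a per-point label list. It is an indexable container (a dict or DataFrame, for example), and when it is given, string arguments such as x, y and c are looked up in it as keys. A plain list of names has no keys to look up, so nothing visible happens. The sketch below shows the intended use; the dict name table and its keys are made up for illustration, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt

# Hypothetical container: the key names are arbitrary, chosen for this example.
table = {
    'xs': [-0.01611772, 1.51755901, -0.64869352, -1.80850313, -0.11505037],
    'ys': [0.04845168, -0.45576903, 0.62703651, -0.24415787, -0.41307092],
    'color_codes': ['b', 'g', 'r', 'b', 'r'],
}

fig, ax = plt.subplots()
# The strings are resolved as table['xs'], table['ys'], table['color_codes'].
points = ax.scatter('xs', 'ys', c='color_codes', data=table)
# Call plt.show() here when running with an interactive backend.
```

So data is a convenience for pulling columns out of a table, and the text labels still have to be drawn separately, for instance with annotate.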

Edit: I just spotted that you also asked about overlapping labels. This question seems to have a good solution. There is also apparently a piece of code on GitHub that is designed to emulate ggrepel.
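To give a rough idea of how label collisions can be avoided with matplotlib alone, here is a minimal greedy sketch. It is neither the solution from the linked question nor the GitHub code, and it is far simpler than ggrepel: each label tries a few fixed positions around its point and keeps the first one whose bounding box does not overlap a label placed earlier. The names place_labels and CANDIDATES are hypothetical, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt

x = [-0.01611772, 1.51755901, -0.64869352, -1.80850313, -0.11505037]
y = [0.04845168, -0.45576903, 0.62703651, -0.24415787, -0.41307092]
colors = ['b', 'g', 'r', 'b', 'r']
labels = ['Gioele', 'Felix', 'Elpi', 'Roro', 'Cacara']

# Candidate placements around a point: (dx, dy, ha, va), offsets in points.
CANDIDATES = [
    (5, 5, 'left', 'bottom'),
    (-5, 5, 'right', 'bottom'),
    (5, -5, 'left', 'top'),
    (-5, -5, 'right', 'top'),
]


def place_labels(ax, xs, ys, texts):
    """Hypothetical helper: greedily pick a non-overlapping spot for each label."""
    fig = ax.figure
    fig.canvas.draw()  # resolve axis limits so text extents can be measured
    renderer = fig.canvas.get_renderer()
    placed = []       # bounding boxes (display coordinates) of labels already fixed
    annotations = []
    for text, xi, yi in zip(texts, xs, ys):
        ann = ax.annotate(text, xy=(xi, yi), xytext=CANDIDATES[0][:2],
                          textcoords='offset points')
        for dx, dy, ha, va in CANDIDATES:
            ann.set_position((dx, dy))
            ann.set_ha(ha)
            ann.set_va(va)
            box = ann.get_window_extent(renderer)
            if not any(box.overlaps(other) for other in placed):
                break
        # If every candidate overlaps, the last one tried is kept.
        placed.append(ann.get_window_extent(renderer))
        annotations.append(ann)
    return annotations


fig, ax = plt.subplots()
ax.scatter(x, y, c=colors)
annotations = place_labels(ax, x, y, labels)
# Call plt.show() or fig.savefig(...) here to view the result.
```

This only checks labels against other labels, not against the markers, and the result depends on the order of the points. For dense plots a dedicated label-repelling library is likely the better choice.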