Andrea Ianni - 25 days ago
R Question

ggplot: how to choose the "proper" colors based on a column

Suppose I have a simple dataframe to plot, in which I want to color the points according to the value in one column. So, if I have:

dataframe
# X1 X2 pop
# 1 -0.11092652 -1.955598e-09 448053
# 2 -0.09999865 -2.310067e-10 418231
# 3 -0.05944755 -3.475013e-09 448473
# 4 0.51378848 1.631781e-09 119548
# 5 0.09438223 -9.606475e-10 323288
# 6 0.19349045 6.074025e-10 203153
# 7 0.06685609 3.210156e-10 208339
# 8 -0.10915456 -1.407190e-09 429178
# 9 -0.10348100 -1.401948e-09 1218038
# 10 -0.08607617 -7.356602e-10 383018
# 11 1.00343465 -2.423237e-08 209550
# 12 -0.05839148 1.503955e-09 287042
# 13 -0.09960163 2.167945e-10 973129
# 14 -0.05793417 2.510107e-09 187249
# 15 0.02191610 2.479708e-09 915225
# 16 0.48877872 1.338346e-08 462999
# 17 -0.10289556 1.472368e-09 1108776
# 18 -0.10316414 2.933469e-10 402422
# 19 -0.09545279 -2.926035e-10 274035
# 20 -0.06111044 3.464014e-09 230749


and I use ggplot in the following way:

ggplot(dataframe) +
ggtitle("Somehow useful spatialization")+ # Electricity / Gas
geom_point(aes(dataframe$X1, dataframe$X2), color = dataframe$pop, size=2 ) +
theme_classic(base_size = 16) +
guides(colour = guide_legend(override.aes = list(size=4)))+
xlab("X")+ylab("Y")


I obtain something like:

That is one possible representation. Nevertheless, suppose I want the point colors to represent the column pop, e.g. running from light orange through dark red to black. How can I "scale" the column pop to obtain such a graphic?

EDIT:

> dput(dataframe)
structure(list(X1 = c(-0.110926520419347, -0.0999986452719714,
-0.0594475526112884, 0.513788479303472, 0.0943822277852107, 0.193490454204271,
0.0668560854540437, -0.109154563987586, -0.103480996064617, -0.0860761723229372,
1.00343465471568, -0.0583914756527933, -0.0996016272609995, -0.0579341671474729,
0.0219161022704227, 0.488778719096658, -0.102895564162661, -0.103164140322136,
-0.0954527927249849, -0.0611104428640883), X2 = c(-1.9555978205951e-09,
-2.31006712207053e-10, -3.47501251356368e-09, 1.63178106438806e-09,
-9.60647459243156e-10, 6.07402512804044e-10, 3.21015629676789e-10,
-1.40718981687972e-09, -1.40194842954735e-09, -7.35660154466167e-10,
-2.423237202138e-08, 1.50395541775022e-09, 2.16794489937917e-10,
2.51010717100061e-09, 2.47970820013341e-09, 1.33834570208731e-08,
1.47236816671351e-09, 2.93346922578509e-10, -2.92603459149485e-10,
3.46401369936372e-09), pop = c(448053L, 418231L, 448473L, 119548L,
323288L, 203153L, 208339L, 429178L, 1218038L, 383018L, 209550L,
287042L, 973129L, 187249L, 915225L, 462999L, 1108776L, 402422L,
274035L, 230749L)), .Names = c("X1", "X2", "pop"), row.names = c(NA,
20L), class = "data.frame")

Answer

With ggplot you can put your aesthetics (aes) in the initial ggplot() call. Since you're already telling ggplot where the data is (in dataframe), you can refer to the variables directly by name (without dataframe$). For the color to follow a scale, it has to be mapped as an aesthetic inside the aes() call, not passed as a static value outside it. Once it is mapped as an aesthetic, we can customize how it behaves by adding a scale. Taking this all into account gives the following code:

ggplot(dataframe, aes(x = X1, y = X2, color = pop)) +
  ggtitle("Somehow useful spatialization")+  # Electricity / Gas
  geom_point(size=2) +
  theme_classic(base_size = 16) +
  guides(colour = guide_legend(override.aes = list(size=4))) +
  xlab("X")+ylab("Y") +
  scale_color_gradient2(low = "green", mid = "red", high = "black", midpoint = mean(dataframe$pop))

This code gives the following graph. The colors can be adjusted further by playing around with the scale_color_gradient2 part. (Why green as the low color gives a better orange than actually choosing orange is beyond me; I just ended up there by coincidence.)
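A likely reason green works here: scale_color_gradient2() centres the scale on midpoint and uses the same step size on both sides, so with skewed data the shorter side never reaches its end colour. With midpoint = mean(pop), the smallest pop lands only about halfway between the midpoint and the low end, so it is drawn as a green/red blend (which reads as orange) rather than pure green. The sketch below reproduces that arithmetic in base R. It reflects my understanding of how the scale rescales values, and colorRamp() is only an approximation of the palette ggplot2 actually uses.

```r
# Population values from the question
pop <- c(448053, 418231, 448473, 119548, 323288, 203153, 208339, 429178,
         1218038, 383018, 209550, 287042, 973129, 187249, 915225, 462999,
         1108776, 402422, 274035, 230749)

mid <- mean(pop)

# A diverging scale pins the midpoint at 0.5 and uses one step size for both
# sides, so the side with the larger spread decides how far 0 and 1 are.
extent   <- 2 * max(abs(range(pop) - mid))
position <- (pop - mid) / extent + 0.5

# The largest value reaches 1 (black); the smallest stops at about 0.27,
# well short of 0 (pure green).
range(position)

# Approximate colour at that lowest position on a green -> red -> black ramp
ramp       <- grDevices::colorRamp(c("green", "red", "black"), space = "Lab")
lowest_hex <- grDevices::rgb(ramp(min(position)), maxColorValue = 255)
lowest_hex
```

If that is the cause, choosing orange as the low colour would blend orange with red and look redder than expected, which matches what was observed.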

The resulting graph
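If the goal is literally the ramp described in the question (light orange, through dark red, to black), one option might be scale_color_gradientn(), which accepts an arbitrary vector of colours and spreads them evenly over the range of the data. This is a sketch rather than a tested recommendation: the three colour names are assumptions, and the data frame is rebuilt from the values printed in the question so the snippet runs on its own.

```r
library(ggplot2)

# Data rebuilt from the question (values as printed there)
dataframe <- data.frame(
  X1 = c(-0.11092652, -0.09999865, -0.05944755, 0.51378848, 0.09438223,
         0.19349045, 0.06685609, -0.10915456, -0.10348100, -0.08607617,
         1.00343465, -0.05839148, -0.09960163, -0.05793417, 0.02191610,
         0.48877872, -0.10289556, -0.10316414, -0.09545279, -0.06111044),
  X2 = c(-1.955598e-09, -2.310067e-10, -3.475013e-09, 1.631781e-09,
         -9.606475e-10, 6.074025e-10, 3.210156e-10, -1.407190e-09,
         -1.401948e-09, -7.356602e-10, -2.423237e-08, 1.503955e-09,
         2.167945e-10, 2.510107e-09, 2.479708e-09, 1.338346e-08,
         1.472368e-09, 2.933469e-10, -2.926035e-10, 3.464014e-09),
  pop = c(448053L, 418231L, 448473L, 119548L, 323288L, 203153L, 208339L,
          429178L, 1218038L, 383018L, 209550L, 287042L, 973129L, 187249L,
          915225L, 462999L, 1108776L, 402422L, 274035L, 230749L)
)

# Map pop to colour inside aes(), then supply the colour ramp explicitly.
# The colour names below are illustrative choices, not from the original answer.
p <- ggplot(dataframe, aes(x = X1, y = X2, color = pop)) +
  ggtitle("Somehow useful spatialization") +
  geom_point(size = 2) +
  theme_classic(base_size = 16) +
  xlab("X") + ylab("Y") +
  scale_color_gradientn(colours = c("orange", "darkred", "black"))

print(p)
```

Unlike scale_color_gradient2, this needs no midpoint: the smallest pop gets the first colour and the largest gets the last.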
