Anna Anna - 2 months ago 25
R Question

Add horizontal lines in categorical scatter plot using ggplot2 in R

I am trying to draw a simple scatter plot for 3 groups, with a different horizontal line segment for each group: for instance, a line at 3 for group "a", at 2.5 for group "b" and at 6 for group "c".

library(ggplot2)

# 120 observations in three groups, values rounded to whole numbers
df <- data.frame(tt = rep(c("a", "b", "c"), 40),
                 val = round(rnorm(120, mean = rep(c(4, 5, 7), each = 40))))

# jittered red points, one column per group
ggplot(df, aes(tt, val)) +
  geom_jitter(colour = "red",
              position = position_jitter(width = 0.05))


I really appreciate your help!

Answer

Never send a line when a point can suffice:

library(ggplot2)

df <- data.frame(tt = rep(c("a", "b", "c"), 40),
                 val = round(rnorm(120, mean = rep(c(4, 5, 7), each = 40))))

# one row per group: the group label and the height of its line
hline <- data.frame(tt = c("a", "b", "c"), v = c(3, 2.5, 6))

ggplot(df, aes(tt, val)) +
  # shape 95 is the "_" character, which renders as a short horizontal bar
  geom_point(data = hline, aes(tt, v), shape = 95, size = 20) +
  geom_jitter(colour = "red",
              position = position_jitter(width = 0.05))


There are other ways if this isn't acceptable, such as:

# numeric x positions 1, 2, 3 stand in for the groups a, b, c
hline <- data.frame(tt = c(1, 2, 3), v = c(3, 2.5, 6))

ggplot(df, aes(tt, val)) +
  geom_jitter(colour = "red",
              position = position_jitter(width = 0.05)) +
  # each segment spans 0.25 either side of its group's position
  geom_segment(data = hline,
               aes(x = tt - 0.25, xend = tt + 0.25, y = v, yend = v))


The downside of the point approach is the egregious thickness and the lack of control over width.
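If the thickness and width of the point glyph are a problem, one alternative (a different technique from the two shown in this answer, offered only as a sketch) is a zero-height error bar: geom_errorbar() with ymin equal to ymax renders as a flat bar, and its width argument sets how wide the bar is on the discrete axis. The width of 0.5 and the object name p are arbitrary choices for illustration.

```r
library(ggplot2)

df <- data.frame(tt = rep(c("a", "b", "c"), 40),
                 val = round(rnorm(120, mean = rep(c(4, 5, 7), each = 40))))

# group labels stay as labels; no numeric positions needed
hline <- data.frame(tt = c("a", "b", "c"), v = c(3, 2.5, 6))

p <- ggplot(df, aes(tt, val)) +
  geom_jitter(colour = "red",
              position = position_jitter(width = 0.05)) +
  # ymin == ymax collapses the error bar to a single horizontal line;
  # inherit.aes = FALSE keeps the plot-level y = val mapping out of this layer
  geom_errorbar(data = hline, inherit.aes = FALSE,
                aes(x = tt, ymin = v, ymax = v), width = 0.5)
p
```

This keeps the group labels in hline, at the cost of using an error-bar geom for something that is not an error bar.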

The downside of the segment approach is that the positions on the discrete axis have to be given as numerics rather than as the factor levels.
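The numeric positions do not have to be typed by hand. A minimal sketch, assuming the axis shows the groups in sorted order (the default for a character column): convert the labels to positions with factor() and as.numeric(). The column name pos and the object name p are made up for this example.

```r
library(ggplot2)

df <- data.frame(tt = rep(c("a", "b", "c"), 40),
                 val = round(rnorm(120, mean = rep(c(4, 5, 7), each = 40))))

hline <- data.frame(tt = c("a", "b", "c"), v = c(3, 2.5, 6))

# assumption: axis order equals the sorted unique labels in df$tt,
# so "a" -> 1, "b" -> 2, "c" -> 3
hline$pos <- as.numeric(factor(hline$tt, levels = sort(unique(df$tt))))

p <- ggplot(df, aes(tt, val)) +
  geom_jitter(colour = "red",
              position = position_jitter(width = 0.05)) +
  # segments are placed at the derived numeric positions
  geom_segment(data = hline, inherit.aes = FALSE,
               aes(x = pos - 0.25, xend = pos + 0.25, y = v, yend = v))
p
```

If tt were a factor with a custom level order, levels(df$tt) would be the natural replacement for sort(unique(df$tt)).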

I also should have set the random seed to ensure reproducibility.
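For example, calling set.seed() just before the data is generated makes rnorm() return the same values on every run. The seed value 42 is arbitrary; any fixed integer does the job.

```r
set.seed(42)  # arbitrary seed, fixed so the simulated data is repeatable

df <- data.frame(tt = rep(c("a", "b", "c"), 40),
                 val = round(rnorm(120, mean = rep(c(4, 5, 7), each = 40))))
```

The jitter is random as well, so for an identical picture the seed has to be set before plotting too, or passed to position_jitter() through its seed argument.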