Sanger99 - 3 months ago
R Question

Adding legends to multiple line plots with ggplot

I'm trying to add a legend to a plot that I've created using ggplot. I load the data in from two csv files, each of which has two columns of 8 rows (not including the header).

I construct a data frame from each file which includes a cumulative total, so each data frame has three columns of data (bv, bin_count and bin_cumulative), 8 rows in each column, and every value is an integer.

The two data sets are then plotted as follows. The display is fine, but I can't figure out how to add a legend to the resulting plot. It seems the ggplot object itself should have a data source, but I'm not sure how to build one when both data frames have columns with the same names.

library(ggplot2)

i2d <- data.frame(bv=c(0,1,2,3,4,5,6,7), bin_count=c(0,0,0,2,1,2,2,3), bin_cumulative=cumsum(c(0,0,0,2,1,2,2,3)))
i1d <- data.frame(bv=c(0,1,2,3,4,5,6,7), bin_count=c(0,1,1,2,3,2,0,1), bin_cumulative=cumsum(c(0,1,1,2,3,2,0,1)))


c_data_plot <- ggplot() +
    geom_line(data = i1d, aes(x=i1d$bv, y=i1d$bin_cumulative), size=2, color="turquoise") +
    geom_point(data = i1d, aes(x=i1d$bv, y=i1d$bin_cumulative), color="royalblue1", size=3) +
    geom_line(data = i2d, aes(x=i2d$bv, y=i2d$bin_cumulative), size=2, color="tan1") +
    geom_point(data = i2d, aes(x=i2d$bv, y=i2d$bin_cumulative), color="royalblue3", size=3) +
    scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
    scale_y_continuous(name="Count", breaks=seq(0,12,1)) +
    ggtitle("Combine plot of BV cumulative counts")

c_data_plot


I'm fairly new to R and would much appreciate any help.

Per comments, I've edited the code to reproduce the dataset after it's loaded into the dataframes.

Regarding producing a single data frame, I'd welcome advice on how to achieve that; I'm still struggling with how data frames work.

Answer

First, we organize the data by combining i1d and i2d into a single data frame. I've added a column, data, which stores the name of the original dataset.

# restructure data
i1d$data <- 'i1d'
i2d$data <- 'i2d'
i12d <- rbind.data.frame(i1d, i2d)
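
As a quick check on the combined data frame, something like the following could be run. It is only a sketch: the example data from the question are rebuilt here so the block runs on its own, and the helper vectors counts_1 and counts_2 are names I made up for illustration.

```r
# Rebuild the two example data frames from the question
# (counts_1 and counts_2 are illustrative helper names)
counts_2 <- c(0, 0, 0, 2, 1, 2, 2, 3)
counts_1 <- c(0, 1, 1, 2, 3, 2, 0, 1)
i2d <- data.frame(bv = 0:7, bin_count = counts_2, bin_cumulative = cumsum(counts_2))
i1d <- data.frame(bv = 0:7, bin_count = counts_1, bin_cumulative = cumsum(counts_1))

# Label each row with the data set it came from, then stack the rows
i1d$data <- 'i1d'
i2d$data <- 'i2d'
i12d <- rbind.data.frame(i1d, i2d)

str(i12d)          # 16 obs. of 4 variables
table(i12d$data)   # 8 rows per data set
```

Stacking the rows like this works because both data frames have the same column names; the new data column is what lets ggplot2 tell the two sets of rows apart.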

Then, we create the plot, using syntax that is more common to ggplot2:

# create plot
ggplot(i12d, aes(x = bv, y = bin_cumulative))+
    geom_line(aes(colour = data), size = 2)+
    geom_point(colour = 'royalblue', size = 3)+
    scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
    scale_y_continuous(name="Count", breaks=seq(0,12,1)) + 
    ggtitle("Combine plot of BV cumulative counts")+
    theme_bw()

If we specify x and y within the ggplot function, we do not need to keep rewriting them in the various geoms we add to the plot. After the first three lines, I copied and pasted what you had so that the formatting would match your expectation. I also added theme_bw, because I think it's more visually appealing. We also specify colour inside aes using a variable (data) from our data.frame; mapping colour to a variable in this way is what makes ggplot2 add the legend.
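
To make the difference between mapping a colour and setting one concrete, here is a small sketch. It rebuilds a trimmed-down i12d so it runs on its own; the names p_mapped and p_fixed and the legend title "Data set" are my own choices for illustration. I use linewidth, which replaced size for lines in ggplot2 3.4.0; on older versions keep size = 2.

```r
library(ggplot2)

# Trimmed-down version of the combined data frame (bin_count omitted)
i1d <- data.frame(bv = 0:7, bin_cumulative = cumsum(c(0, 1, 1, 2, 3, 2, 0, 1)), data = 'i1d')
i2d <- data.frame(bv = 0:7, bin_cumulative = cumsum(c(0, 0, 0, 2, 1, 2, 2, 3)), data = 'i2d')
i12d <- rbind(i1d, i2d)

# colour mapped inside aes(): one colour per value of `data`, and a legend is drawn
p_mapped <- ggplot(i12d, aes(x = bv, y = bin_cumulative)) +
    geom_line(aes(colour = data), linewidth = 2) +
    labs(colour = "Data set")   # legend title (illustrative choice)

# colour set outside aes(): one fixed colour and no legend;
# group = data still keeps the two lines separate
p_fixed <- ggplot(i12d, aes(x = bv, y = bin_cumulative)) +
    geom_line(aes(group = data), colour = 'turquoise', linewidth = 2)

p_mapped
p_fixed
```

This is also why the points in my plot do not show up in the legend: their colour is set to a fixed value rather than mapped.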

If we want to take this a step further, we can use the scale_colour_manual function to specify the colors attributed to the different values of the data column in the data.frame i12d:

ggplot(i12d, aes(x = bv, y = bin_cumulative))+
    geom_line(aes(colour = data), size = 2)+
    geom_point(colour = 'royalblue', size = 3)+
    scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
    scale_y_continuous(name="Count", breaks=seq(0,12,1)) + 
    ggtitle("Combine plot of BV cumulative counts")+
    theme_bw()+
    scale_colour_manual(values = c('i1d' = 'turquoise',
                                   'i2d' = 'tan1'))
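
If you would rather keep i1d and i2d as separate data frames, as in the question, one alternative that I believe works is to map colour to a constant string inside aes() in each layer; ggplot2 then builds the legend from those strings. This is a sketch, not part of the code above: the name p_sep and the legend title "Data set" are illustrative choices, and linewidth assumes ggplot2 3.4.0 or later (use size on older versions).

```r
library(ggplot2)

i1d <- data.frame(bv = 0:7, bin_cumulative = cumsum(c(0, 1, 1, 2, 3, 2, 0, 1)))
i2d <- data.frame(bv = 0:7, bin_cumulative = cumsum(c(0, 0, 0, 2, 1, 2, 2, 3)))

# No default data set; each layer brings its own data but shares x and y
p_sep <- ggplot(mapping = aes(x = bv, y = bin_cumulative)) +
    # the constant strings 'i1d' and 'i2d' become the legend entries
    geom_line(data = i1d, aes(colour = 'i1d'), linewidth = 2) +
    geom_line(data = i2d, aes(colour = 'i2d'), linewidth = 2) +
    # name sets the legend title; values assigns a colour to each entry
    scale_colour_manual(name = "Data set",
                        values = c('i1d' = 'turquoise', 'i2d' = 'tan1')) +
    theme_bw()

p_sep
```

I still prefer the single combined data frame, since it scales to more data sets without adding a layer for each one.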
