RAS RAS - 2 months ago 14
R Question

Plot each row of data frame as separate graph in R

I have a data frame in this format:

row.names 100 50 25 0
metabolite1 113417.2998 62594.7067 39460.7705 1.223243e+02
metabolite2 3494058.7972 2046871.7446 1261278.2476 6.422864e+03


The columns refer to the concentrations of quality controls (%): 100, 50, 25, 0.

Currently to plot a single graph I am extracting the data into a new data frame and plotting it like this:

library(ggplot2)

metabolite1 <- data.frame(Numbers = c(100, 50, 25, 0), Signal = c(113417.2998, 62594.7067, 39460.7705, 122.3243))
# Extract intercept and slope of the line of best fit
Coef <- coef(lm(Signal ~ Numbers, data = metabolite1))
# plot data
ggplot(metabolite1, aes(x = Numbers, y = Signal)) +
  geom_point() +
  xlim(0, 100) +
  geom_abline(intercept = Coef[1], slope = Coef[2])


This is extremely inefficient, and I am trying to find a better way to plot a separate scatter plot for each row without creating a separate data frame each time. What would be a better way to do this? I have 160 metabolites I need to produce graphs for. I have attempted to melt the data frame into this format:

Name variable value
metabolite1 100 113417.2998
metabolite2 100 3494058.7972
metabolite1 50 62594.7067
metabolite2 50 2046871.7446
metabolite1 25 39460.7705
metabolite2 25 1261278.2476
metabolite1 0 1.223243e+02
metabolite2 0 6.422864e+03
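
For reference, a minimal sketch of getting from the wide table to this long layout. It uses base R rather than melt, and it assumes the metabolite names are stored as row names and the concentrations as column names; the object names wide and long are only illustrative. Note that melt usually stores the former column names as a factor, so converting them to numbers (as done here) lets ggplot treat concentration as a continuous x axis.

```r
# Wide table from the question: one row per metabolite, one column per QC concentration (%).
# check.names = FALSE keeps the column names as "100", "50", "25", "0".
wide <- data.frame(
  `100` = c(113417.2998, 3494058.7972),
  `50`  = c(62594.7067, 2046871.7446),
  `25`  = c(39460.7705, 1261278.2476),
  `0`   = c(122.3243, 6422.864),
  row.names = c("metabolite1", "metabolite2"),
  check.names = FALSE
)

# Long format: one row per (metabolite, concentration) pair.
# as.matrix() is unrolled column by column, so the names repeat per column
# and each concentration repeats once per metabolite.
long <- data.frame(
  Name     = rep(rownames(wide), times = ncol(wide)),
  variable = rep(as.numeric(colnames(wide)), each = nrow(wide)),
  value    = as.vector(as.matrix(wide))
)

long
```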


and then use ggplot and faceting to plot the data:

ggplot(data = df, aes(x = variable, y = value)) +
  geom_point() + facet_grid(~ Name)


but the plots produced all share the same y-axis scale, which is not appropriate for the data I am working with. Because of this, I'm assuming I cannot use faceting to produce the plots.

EDIT: I do not know how to add separate lines of best fit to each plot without using geom_smooth, which I do not wish to do.

Answer

You're on the right track with your method of melting and faceting. Use facet_wrap with scales = "free_y" so that each panel gets its own y axis:

ggplot(data = df, aes(x = variable, y = value)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, lwd = .5, col = "black") +
  facet_wrap(~ Name, scales = "free_y") 
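
If you would rather avoid geom_smooth, as the edit to the question asks, one possible sketch is to fit lm once per metabolite and pass the coefficients to geom_abline through its own data frame. This assumes df is the melted data with variable stored as a number; the helper data frame fits is just an illustrative name.

```r
library(ggplot2)

# Long-format data as in the question, with the concentration stored as a number
df <- data.frame(
  Name     = rep(c("metabolite1", "metabolite2"), times = 4),
  variable = rep(c(100, 50, 25, 0), each = 2),
  value    = c(113417.2998, 3494058.7972, 62594.7067, 2046871.7446,
               39460.7705, 1261278.2476, 122.3243, 6422.864)
)

# One lm() fit per metabolite; keep intercept and slope in a small data frame
fits <- do.call(rbind, lapply(split(df, df$Name), function(d) {
  cf <- coef(lm(value ~ variable, data = d))
  data.frame(Name = d$Name[1], intercept = cf[[1]], slope = cf[[2]])
}))

# Because fits has a Name column, each facet only draws its own line
p <- ggplot(df, aes(x = variable, y = value)) +
  geom_point() +
  geom_abline(data = fits, aes(intercept = intercept, slope = slope)) +
  facet_wrap(~ Name, scales = "free_y")
p
```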

This yields plots similar to those you get from running ggplot on subsets:

out <- lapply(list(metabolite1, metabolite2), function(d) {
  Coef <- coef(lm(Signal ~ Numbers, data = d))
  # plot data
  p <- ggplot(d, aes(x = Numbers, y = Signal)) +
    geom_point() + 
    xlim(0,100) +
    geom_abline(intercept = Coef[1], slope = Coef[2]) 
})
gridExtra::grid.arrange(out[[1]], out[[2]], nrow = 1)
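
With 160 metabolites, a single grid may get crowded. A rough sketch of the same per-subset idea, applied to the melted data and written to a multi-page PDF with one metabolite per page, could look like this. The function name plot_one and the output path are hypothetical, and df is assumed to be the long data with a numeric variable column.

```r
library(ggplot2)

# Long-format data as in the question, with the concentration stored as a number
df <- data.frame(
  Name     = rep(c("metabolite1", "metabolite2"), times = 4),
  variable = rep(c(100, 50, 25, 0), each = 2),
  value    = c(113417.2998, 3494058.7972, 62594.7067, 2046871.7446,
               39460.7705, 1261278.2476, 122.3243, 6422.864)
)

# Build the scatter plot and fitted line for a single metabolite
plot_one <- function(d) {
  cf <- coef(lm(value ~ variable, data = d))
  ggplot(d, aes(x = variable, y = value)) +
    geom_point() +
    xlim(0, 100) +
    geom_abline(intercept = cf[[1]], slope = cf[[2]]) +
    ggtitle(d$Name[1])
}

# One plot per metabolite, named after the metabolite
plots <- lapply(split(df, df$Name), plot_one)

# Write every plot to its own page of one PDF (hypothetical output location)
out_file <- file.path(tempdir(), "metabolites.pdf")
pdf(out_file, width = 5, height = 4)
for (p in plots) print(p)
dev.off()
```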
