JDS - 3 months ago
R Question

Adding color to multiple regression lines in a doubly nested lapply

I'm trying to draw several regression lines in each plot of a multi-panel figure. So far, each panel gets its title and its regression lines, but I'm having trouble adding color to the regression lines.

set.seed(1)
abc.df <- data.frame(col1 = rep(c("a", "b", "c"), 1000), col2 = rep(1:4, 750),
                     col3 = rnorm(3000), col4 = rnorm(3000, 2))
names(abc.df) <- c("factor1", "factor2", "q", "value")
abc.df$factor1 <- as.factor(abc.df$factor1)
abc.df$factor2 <- as.factor(abc.df$factor2)
abc_list <- split(abc.df, abc.df$factor1)
namelist <- names(abc_list)
colorlist <- c("red", "green", "blue", "orange")
par(mfrow = c(1, 3))
lapply(names(abc_list), function(x) {
  plot(abc_list[[x]]$q, abc_list[[x]]$value, pch = 20,
       col = adjustcolor(colorlist, alpha = 0.3),
       xlab = "q", ylab = "Value", main = x);
  newsplit <- split(abc_list[[x]], abc_list[[x]]$factor2);
  lapply(names(newsplit), function(y){
    abline(lm(value ~ q, newsplit[[y]], col = colorlist[y]))
  })})


However, I get warnings saying that the col argument is not recognized and will be ignored. How can I fix my logic?

And yes, I know I could use ggplot2, but I'm trying to figure it out in base graphics.

Answer

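There are two problems in your last line. First, col = ... sits inside the call to lm() instead of abline(), which is why R warns that the argument is disregarded. Second, y comes from names(newsplit), so it is a character string, and indexing the unnamed colorlist with a character string returns NA rather than a color. An easy fix is to pass col to abline() and convert y to an integer: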
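To make the indexing issue concrete, here is a minimal check using the same colorlist as in the question. The names by_name and by_pos are only illustrative and are not part of the original code.

```r
colorlist <- c("red", "green", "blue", "orange")

# names(newsplit) yields character strings such as "1", "2", ...
y <- "1"

# Character indexing looks for an element *named* "1"; colorlist has no names.
by_name <- colorlist[y]

# Integer indexing looks up by position instead.
by_pos <- colorlist[as.integer(y)]

print(is.na(by_name))  # the name lookup finds nothing
print(by_pos)          # the positional lookup finds the first color
```

A color of NA draws nothing, so even with col moved into abline(), the lines would not show up until y is converted.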

set.seed(1)
abc.df <- data.frame(col1 = rep(c("a", "b", "c"), 1000), col2 = rep(1:4, 750),
                     col3 = rnorm(3000), col4 = rnorm(3000, 2))
names(abc.df) <- c("factor1", "factor2", "q", "value")
abc.df$factor1 <- as.factor(abc.df$factor1)
abc.df$factor2 <- as.factor(abc.df$factor2)
abc_list <- split(abc.df, abc.df$factor1)
namelist <- names(abc_list)
colorlist <- c("red", "green", "blue", "orange") 
par(mfrow = c(1, 3))
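lapply(names(abc_list), function(x) {
  # One scatter plot per level of factor1, titled with that level
  plot(abc_list[[x]]$q, abc_list[[x]]$value, pch = 20,
       col = adjustcolor(colorlist, alpha = 0.3),
       xlab = "q", ylab = "Value", main = x)
  # Split the panel's data by factor2 and fit one line per level
  newsplit <- split(abc_list[[x]], abc_list[[x]]$factor2)
  lapply(names(newsplit), function(y) {
    # col belongs to abline(), not lm(); y is character, so convert it
    abline(lm(value ~ q, newsplit[[y]]), col = colorlist[as.integer(y)])
  })})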
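As an alternative sketch (not what the answer above does), you could name the color vector after the levels of factor2, so that character lookups work without any conversion. In the code above the point colors are recycled in row order, so they need not match the factor2 level of each point. The variant below indexes the colors by each point's own level instead. The names panel, groups, and lev are illustrative, and wrapping the call in invisible() is only there to suppress the printed list of NULLs.

```r
set.seed(1)
abc.df <- data.frame(
  factor1 = factor(rep(c("a", "b", "c"), 1000)),
  factor2 = factor(rep(1:4, 750)),
  q       = rnorm(3000),
  value   = rnorm(3000, 2)
)

# Name each color after a factor2 level so colorlist[["1"]] etc. work directly.
colorlist <- setNames(c("red", "green", "blue", "orange"),
                      levels(abc.df$factor2))

par(mfrow = c(1, 3))
invisible(lapply(split(abc.df, abc.df$factor1), function(panel) {
  # Color every point by its own factor2 level so points match their line.
  point_cols <- adjustcolor(colorlist[as.character(panel$factor2)],
                            alpha.f = 0.3)
  plot(panel$q, panel$value, pch = 20, col = point_cols,
       xlab = "q", ylab = "Value",
       main = as.character(panel$factor1[1]))

  # One regression line per factor2 level, colored by name lookup.
  groups <- split(panel, panel$factor2)
  for (lev in names(groups)) {
    abline(lm(value ~ q, data = groups[[lev]]), col = colorlist[[lev]])
  }
}))
```

Whether to convert the index with as.integer() or to name the colors is mostly a matter of taste. Named colors keep working if the factor levels are not the integers 1 to 4.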