atclaus - 3 months ago
R Question

R - Mapply Functionality Creating and PDFing Plots/ggplots

I am fairly new to R and am hoping someone can explain two things to me in the code below.

  1. Why do I need double curly brackets {{ around the plot to get recordPlot to capture it so I can replay it? And why do I then need double square brackets [[ in replayPlot?

  2. Why can I not use the $ notation inside mapply? It works outside of it. Is it bad to use $ in "proper" R work?


My real code is much larger, so I think it is best to get mapply to work.

library(ggplot2)
library(gridExtra)

TDSF <- data.frame(Graduation=sample(1950:2010, 30,replace=TRUE),
Donation=sample(10:50000, 30,replace=TRUE),
Start.Year=sample(1950:2010,30,replace=TRUE),
State=sample(state.abb,30,replace=TRUE))
TDSF$Graduation <- as.numeric(as.character(TDSF$Graduation))
TDSF$Start <- as.numeric(as.character(TDSF$Start))

plots2 <- mapply(function(nm,df.year,df.bracket_5,df.bracket_10) list(
{{plot(-1:1,-1:1,type="n",xaxt="n",yaxt="n",ann=FALSE)+
text(0,0,paste("Analysis by",nm,"Year"),cex=2)
recordPlot()}},
{ggplot(data=TDSF, aes_(x=as.name(nm))) + geom_histogram(color="red",binwidth = 1,boundary=-.01)},
{ggplot(data=TDSF, aes_(x=as.name(nm))) + geom_histogram(color="red",binwidth = 5,boundary=-.01)}
),c("Graduation","Start"),SIMPLIFY = FALSE)

replayPlot(plots2$Graduation[[1]]) #use $ notation
do.call(grid.arrange,plots2$Graduation[2:3]) #use $ notation

mapply(function(nm)
{pdf(file=paste(nm,"test.pdf"))
replayPlot(plots2[[nm]][[1]]) #use [[]][[]]
do.call(grid.arrange,c(plots2[[nm]][2:3],ncol=1)) #use [[]][[]]
dev.off()}
,c("Graduation","Start"))

Answer

Let me reformat your code a bit:

library(ggplot2)
library(gridExtra)
TDSF <- data.frame(Graduation=sample(1950:2010, 30,replace=TRUE),
                   Donation=sample(10:50000, 30,replace=TRUE),
                   Start.Year=sample(1950:2010,30,replace=TRUE),
                   State=sample(state.abb,30,replace=TRUE))
TDSF$Graduation <- as.numeric(as.character(TDSF$Graduation))
TDSF$Start <- as.numeric(as.character(TDSF$Start))

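# For each column name, build a list of one recorded base plot and two histograms
plots2 <- mapply(function(nm,df.year,df.bracket_5,df.bracket_10) list(
  # Title page: draw an empty canvas, add the text, then record the result.
  # A single pair of braces is enough; the block returns the value of recordPlot().
  {plot(-1:1,-1:1,type="n",xaxt="n",yaxt="n",ann=FALSE)
    text(0,0,paste("Analysis by",nm,"Year"),cex=2)
    recordPlot()},
  # Histograms of the same column with bin widths of 1 and 5
  {ggplot(data=TDSF, aes_(x=as.name(nm))) + geom_histogram(color="red",binwidth = 1,boundary=-.01)},
  {ggplot(data=TDSF, aes_(x=as.name(nm))) + geom_histogram(color="red",binwidth = 5,boundary=-.01)}
),c("Graduation","Start"),SIMPLIFY = FALSE)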

plots2

replayPlot(plots2$Graduation[[1]])  #use $ notation
n <- length(plots2)
nCol <- floor(sqrt(n))
do.call("grid.arrange",c(plots2$Graduation[2:3], ncol=nCol)) #use $ notation


replay <- function(nm)  {
  pdf(file = paste(nm,"test.pdf"))
  replayPlot(plots2[[nm]][[1]]) #use [[]][[]]
  do.call(grid.arrange,c(plots2[[nm]][2:3],ncol = 1)) #use [[]][[]]
  dev.off()
}

mapply(replay ,c("Graduation","Start"))
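
  1. You do not need the double curly brackets. A single pair { ... } is enough: the block is evaluated and returns the value of its last expression, here the result of recordPlot().

     The double square brackets extract a single element from a list, and they can be chained to go one level deeper each time. plots2 is a nested list, so there are several equivalent ways to access its elements. For example, plots2$Graduation[[1]] is equivalent to plots2[[1]][[1]]. Single brackets, as in plots2$Graduation[1], would return a list of length one rather than the recorded plot itself, which is why replayPlot needs [[1]].

  2. The $ issue is not related to mapply. As the replay function (which I added) shows, it comes down to how R interprets nm at runtime: nm is a variable, and plots2$nm looks for an element literally named "nm" instead of using the value stored in nm. Use plots2[[nm]] instead; to make $ work here you would have to build and evaluate the expression yourself (with some eval function).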
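
To make both points concrete, here is a minimal sketch that needs no graphics device. The nested list plots_demo is a hypothetical stand-in for plots2 (plain strings in place of plots), so the names and values in it are assumptions for illustration only.

```r
# Hypothetical stand-in for plots2: a named list of lists
plots_demo <- list(
  Graduation = list("title page", "histogram 1", "histogram 5"),
  Start      = list("title page", "histogram 1", "histogram 5")
)

# Point 1a: a braced block returns its last expression,
# so a second pair of braces changes nothing.
single <- { x <- 1; x + 1 }
double <- {{ x <- 1; x + 1 }}
same_braces <- identical(single, double)

# Point 1b: [[ ]] extracts the element itself, [ ] returns a sub-list.
element  <- plots_demo$Graduation[[1]]   # a character string
sub_list <- plots_demo$Graduation[1]     # a list of length one
same_access <- identical(plots_demo$Graduation[[1]], plots_demo[[1]][[1]])

# Point 2: $ takes the name literally, [[ ]] evaluates its argument.
nm <- "Graduation"
literal   <- plots_demo$nm        # NULL: there is no element called "nm"
evaluated <- plots_demo[[nm]]     # the Graduation list

# Inside mapply the name arrives as a variable, so [[ ]] is the natural choice.
firsts <- mapply(function(nm) plots_demo[[nm]][[1]], c("Graduation", "Start"))
```

The same reasoning applies to the real plots2: inside replay, nm holds a column name, so plots2[[nm]] finds the right sub-list, whereas plots2$nm would not.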