EmilBB - 11 months ago
R Question

Having two for-loops to create a list with (i,j) where the outer loop doesn't override the inner

I'm trying to get to grips with lapply and loops in R. I need to rename every column of a matrix whose column names are arbitrary, giving them meaningful names. The result should look like this:

wages.seg.mean.1996 <- rnorm(4)
wages.seg.var.1996 <- rnorm(4)
wages.seg.sd.1996 <- rnorm(4)
wages.seg.min.1996 <- rnorm(4)
wages.seg.max.1996 <- rnorm(4)
wages.seg.total.1996 <- rnorm(4)
wages.seg.mean.1997 <- rnorm(4)
wages.seg.var.1997 <- rnorm(4)
wages.seg.sd.1997 <- rnorm(4)
wages.seg.min.1997 <- rnorm(4)
wages.seg.max.1997 <- rnorm(4)
wages.seg.total.1997 <- rnorm(4)
df <- data.frame(wages.seg.mean.1996,wages.seg.var.1996,wages.seg.sd.1996,wages.seg.min.1996,wages.seg.max.1996,wages.seg.total.1996,wages.seg.mean.1997,wages.seg.var.1997,wages.seg.sd.1997,wages.seg.min.1997,wages.seg.max.1997,wages.seg.total.1997)

I know how the variables are sorted, so I just need a list with all the new names in correct order, then I can do this:

colnames(df) <- unlist(wages.seg.list)
df <- tbl_df(df)  # tbl_df() comes from dplyr

In my real data the period is 1996-2009, but I guess this example doesn't need more than two years.
I tried the following, but because the second loop is nested, the last year ends up overwriting the others:

wages.seg.list <- NULL
wages.seg.list <- list()
for (i in 1996:1997) {
  for (j in seq(1,12,6)) {
    wages.seg.list[[j]] <- paste0("wages.seg.mean.",i)
    wages.seg.list[[j+1]] <- paste0("wages.seg.var.",i)
    wages.seg.list[[j+2]] <- paste0("wages.seg.sd.",i)
    wages.seg.list[[j+3]] <- paste0("wages.seg.min.",i)
    wages.seg.list[[j+4]] <- paste0("wages.seg.max.",i)
    wages.seg.list[[j+5]] <- paste0("wages.seg.total.",i)
  }
}

So you see how I try to vary position in the list and the name of the variable at the same time. I can do this by hand, of course, but I want to understand the logic of how to do something like this. If you could help me to get that, I would be grateful.


Answer Source

Does this do what you're looking for?

You said you wanted to learn more about loops in R, but this does the opposite: it shows how to avoid them, which is one of R's strengths. Because R is vectorized, using it efficiently means thinking a bit differently than in other languages. You spend less effort telling the computer "how" to do things and focus on "what" you want to achieve.
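If you do want the loop version, the overwrite happens because the inner loop visits both block starts (1 and 7) for every year, so the last year writes over everything. Here is a minimal sketch of one possible fix, where the list position is derived from both loop counters. The names `stat.prefixes`, `years`, `offset` and `loop.names` are made up for this illustration, and the prefixes are assumed from the column names in the question:

```r
# Name stems, in the column order given in the question (assumed).
stat.prefixes <- c("wages.seg.mean.", "wages.seg.var.", "wages.seg.sd.",
                   "wages.seg.min.", "wages.seg.max.", "wages.seg.total.")
years <- 1996:1997

# Pre-allocate one slot per (year, statistic) pair.
wages.seg.list <- vector("list", length(stat.prefixes) * length(years))

for (y in seq_along(years)) {
  # Start of this year's block: 0 for the first year, 6 for the second, ...
  offset <- (y - 1) * length(stat.prefixes)
  for (s in seq_along(stat.prefixes)) {
    # Position depends on both counters, so no year overwrites another.
    wages.seg.list[[offset + s]] <- paste0(stat.prefixes[s], years[y])
  }
}

# Flatten the list into a character vector of 12 names.
loop.names <- unlist(wages.seg.list)
```

The outer loop picks the block and the inner loop picks the slot inside it, so each of the 12 positions is written exactly once.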

Edit: an even more vectorized solution.

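# name stems, one per statistic (taken from the column names in the question;
# note the trailing dot, since the year is pasted on directly).
yearly.name.template <- c("wages.seg.mean.", "wages.seg.var.", "wages.seg.sd.",
                          "wages.seg.min.", "wages.seg.max.", "wages.seg.total.")
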
# create the names for your list titles: with no shared columns,
# merge() pairs every name stem with every year (a cross join).
all.names.df <- merge(yearly.name.template, 1996:1997, all=TRUE)

# apply(x,1,FUN,...) works row-wise on the data frame or matrix.
# 'collapse' is needed to merge the entire line.
all.names <- apply(all.names.df, 1, paste0, collapse="")

# now create a list of blank entries, one entry per list title.
wages.seg.list <- vector("list", length(all.names))

# and push the names in.
names(wages.seg.list) <- all.names
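
Since the goal in the question is to rename columns, a plain character vector may be all that is needed. The following is a rough sketch of that idea using `rep()` and `paste0()` instead of `merge()` and `apply()`. The names `stat.prefixes`, `years` and `vec.names` are hypothetical, and the data frame is a random stand-in for the real data:

```r
# Name stems, in the column order given in the question (assumed).
stat.prefixes <- c("wages.seg.mean.", "wages.seg.var.", "wages.seg.sd.",
                   "wages.seg.min.", "wages.seg.max.", "wages.seg.total.")
years <- 1996:1997

# 'times' cycles through all stems once per year;
# 'each' repeats every year once per stem, so the two vectors line up.
vec.names <- paste0(rep(stat.prefixes, times = length(years)),
                    rep(years, each = length(stat.prefixes)))

# Stand-in data frame with 12 arbitrarily named columns (V1 ... V12).
set.seed(1)
df <- as.data.frame(matrix(rnorm(4 * length(vec.names)), nrow = 4))

# Rename all columns in one step.
colnames(df) <- vec.names
```

For the full 1996-2009 period, only `years` would need to change, provided the columns really are sorted year by year as described in the question.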