Paul van Oppen - 5 months ago 35
R Question

# Iterate through two lists with apply functions

I have a list of data frames in which each column has a name in the first row and x-s at some positions below it. If a row has an x in a column, the name in that column's first row is viewed as selected.
In the real-world problem I read an xlsx file with many sheets, where each sheet contains a large, somewhat sparse matrix: each column has a name in the first row and many x-s below it. Each sheet becomes a data frame in a list of data frames. The row names contain an identifier that is relevant to the lookup but not to the issue described here.

``````
data1 <- data.frame(Col1 = c("Mark", "x", "", "x", "", ""),
                    Col2 = c("Paul", "", "", "", "x", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "x", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "x", "", ""),
                    stringsAsFactors = FALSE)

data2 <- data.frame(Col1 = c("Mark", "x", "x", "", "", ""),
                    Col2 = c("Paul", "", "", "", "", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "", "", ""),
                    stringsAsFactors = FALSE)

data <- list(data1 = data1, data2 = data2)
``````

Each data frame in the list has the following structure (shown as a matrix for convenience) where the names are the same for each data frame in the list. Only the x-s are different:

``````
> as.matrix(data1)
     Col1   Col2   Col3   Col4   Col5
[1,] "Mark" "Paul" "Jane" "Mary" "Peter"
[2,] "x"    ""     ""     "x"    "x"
[3,] ""     ""     ""     "x"    "x"
[4,] "x"    ""     ""     "x"    "x"
[5,] ""     "x"    ""     ""     ""
[6,] ""     ""     ""     ""     ""
``````

I would like to add one column ("Approvers") to each data frame in the list that, for each row, concatenates the names from row 1 of every column that has an 'x' in that row, as follows:

``````
     Col1   Col2   Col3   Col4   Col5    Approvers
[1,] "Mark" "Paul" "Jane" "Mary" "Peter" ""
[2,] "x"    ""     ""     "x"    "x"     "Mark; Mary; Peter"
[3,] ""     ""     ""     "x"    "x"     "Mary; Peter"
[4,] "x"    ""     ""     "x"    "x"     "Mark; Mary; Peter"
[5,] ""     "x"    ""     ""     ""      "Paul"
[6,] ""     ""     ""     ""     ""      ""
``````

At the moment I resolve this in two steps:

1. I create another list of lists that holds the column positions of each x

2. In a nested for loop I look up all the names in the first row and concatenate them.

The code is as follows:

``````
position <- lapply(data, function(x) apply(x, 1, function(y) which(y %in% "x")))
# remove integer(0) and replace with 0
position <- lapply(position, function(x) lapply(x, function(y) {
  if (length(y) == 0L) return(0) else return(y)
}))
# flatten second level list into string
position <- lapply(position, function(x) lapply(x, function(x) paste(x, collapse = ",")))

for (i in 1:length(data)) {
  for (j in 1:nrow(data[[i]])) {
    if (as.numeric(unlist(strsplit(position[[i]][[j]], ",")))[[1]] == 0) {
      data[[i]][j, "Approvers"] <- ""
    } else {
      data[[i]][j, "Approvers"] <- paste(data[[i]][1, as.numeric(unlist(strsplit(position[[i]][[j]], ",")))], collapse = "; ")
    }
  }
}
``````

To me this is clumsy, and I would like to do it with lapply and mapply by looping through both lists simultaneously, but I cannot figure out how. Also, creating the position object, collapsing the column indices of the x-s into a string, and separating them again in the loop is overly complicated.
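Walking two lists in step is what `Map()` does; it is a thin wrapper around `mapply()` with `SIMPLIFY = FALSE`. A minimal sketch of that idea, assuming the column indices are kept as integer vectors instead of being collapsed into strings; `idx` and `add_approvers` are illustrative names that do not appear in the code above:

```r
# Example data in the same layout as the question:
# names in row 1, "x" marks in the rows below.
data1 <- data.frame(Col1 = c("Mark", "x", "", "x", "", ""),
                    Col2 = c("Paul", "", "", "", "x", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "x", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "x", "", ""),
                    stringsAsFactors = FALSE)
data2 <- data.frame(Col1 = c("Mark", "x", "x", "", "", ""),
                    Col2 = c("Paul", "", "", "", "", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "", "", ""),
                    stringsAsFactors = FALSE)
data <- list(data1 = data1, data2 = data2)

# For each data frame, a list with one integer vector per row:
# the columns marked "x". lapply() over row numbers always returns
# a list, whereas apply() may simplify its result.
idx <- lapply(data, function(df) {
  lapply(seq_len(nrow(df)), function(i) which(df[i, ] == "x"))
})

# Illustrative helper: takes one data frame and its matching index list.
add_approvers <- function(df, pos) {
  first_row <- unlist(df[1, ], use.names = FALSE)  # names as a character vector
  # An empty index vector yields "" because paste() collapses character(0) to "".
  df$Approvers <- vapply(pos,
                         function(p) paste(first_row[p], collapse = "; "),
                         character(1))
  df
}

# Map() pairs data[[k]] with idx[[k]] and keeps the names of the first list.
res <- Map(add_approvers, data, idx)
```

Because empty index vectors already collapse to an empty string, the placeholder `0` and the `strsplit` round trip are not needed in this variant.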

We can use `lapply` to loop over the `list`, then `apply` to loop over the rows of each data frame, and `paste` together the elements of the first row wherever the row's value is `x`:

``````
res <- lapply(data, function(x) {
  x$Approvers <- apply(x, 1, FUN = function(y) paste(x[1, ][y == "x"], collapse = ";"))
  x
})
res
#$data1
#  Col1 Col2 Col3 Col4  Col5       Approvers
#1 Mark Paul Jane Mary Peter
#2    x              x     x Mark;Mary;Peter
#3                   x     x      Mary;Peter
#4    x              x     x Mark;Mary;Peter
#5         x                            Paul
#6

#$data2
#  Col1 Col2 Col3 Col4  Col5       Approvers
#1 Mark Paul Jane Mary Peter
#2    x              x     x Mark;Mary;Peter
#3    x                    x      Mark;Peter
#4                   x                  Mary
#5
#6
``````
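The output above joins names with `;` only, while the question asks for `"; "`. A small variant of the same `lapply`/`apply` idea, sketched under the assumption that the data has the layout shown in the question. Flattening row 1 into a plain character vector (`first_row`, an illustrative name) is a stylistic choice here, not something the approach requires:

```r
# Same example data as in the question.
data1 <- data.frame(Col1 = c("Mark", "x", "", "x", "", ""),
                    Col2 = c("Paul", "", "", "", "x", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "x", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "x", "", ""),
                    stringsAsFactors = FALSE)
data2 <- data.frame(Col1 = c("Mark", "x", "x", "", "", ""),
                    Col2 = c("Paul", "", "", "", "", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "", "", ""),
                    stringsAsFactors = FALSE)
data <- list(data1 = data1, data2 = data2)

res <- lapply(data, function(x) {
  # Row 1 as a character vector, so it can be subset with a logical mask.
  first_row <- unlist(x[1, ], use.names = FALSE)
  # For every row, keep the names whose column holds "x" and join with "; ".
  x$Approvers <- apply(x, 1, function(y) paste(first_row[y == "x"], collapse = "; "))
  x
})
```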

NOTE: It seems like the `names` of the dataset should be 'Mark', 'Paul', etc. instead of 'Col1', 'Col2', ...
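If the first row really is meant to be the header, it can be promoted to column names before the lookup. A rough sketch, assuming the first row of every data frame holds unique names; `promote_header` is a hypothetical helper, and only `data1` from the question is used to keep the example short:

```r
data1 <- data.frame(Col1 = c("Mark", "x", "", "x", "", ""),
                    Col2 = c("Paul", "", "", "", "x", ""),
                    Col3 = c("Jane", "", "", "", "", ""),
                    Col4 = c("Mary", "x", "x", "x", "", ""),
                    Col5 = c("Peter", "x", "x", "x", "", ""),
                    stringsAsFactors = FALSE)
data <- list(data1 = data1)

# Hypothetical helper: use row 1 as column names and drop it from the data.
promote_header <- function(df) {
  out <- df[-1, , drop = FALSE]
  names(out) <- unlist(df[1, ], use.names = FALSE)
  rownames(out) <- NULL  # renumber the remaining rows from 1
  out
}

named <- lapply(data, promote_header)

res <- lapply(named, function(x) {
  # The names now live in names(x), so no lookup into row 1 is needed.
  x$Approvers <- apply(as.matrix(x) == "x", 1,
                       function(is_x) paste(names(x)[is_x], collapse = "; "))
  x
})
```

With the header promoted, the result has one row fewer than the original, because the name row is no longer part of the data.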
