Olivia Olivia - 24 days ago 8
R Question

Understanding list behaviour

I feel like I have a good understanding of data.frames and how they work, but certain aspects of lists are confusing me.

Here is some reproducible data to start:

list_a <- structure(list(`one` = structure(list(
words = c("a", "b","c", "d", "e", "f")), .Names = "words", class = "data.frame", row.names = c(NA,-6L)),
`two` = structure(list(words = c("a","s","t","z")), .Names = "words", class = "data.frame", row.names = c(NA, -4L))),
.Names = c("one", "two"))


This gives us:

list_a
$one
words
1 a
2 b
3 c
4 d
5 e
6 f

$two
words
1 a
2 s
3 t
4 z


Now I want to loop through the list to return some of the results in the data.frames.

list <- list()

for(i in list_a){list <- append(list, list_a$i$words)}


This produces no results in list. Neither does:

for(i in list_a){list <- append(list, list_a[[i]]$words)}
Error in list_a[[i]] : invalid subscript type 'list'


I thought perhaps the reason my first loop didn't work was that I was using
list_a$i$words
without defining i as the correct names. So I tried:

for(i in names(list_a)){list <- append(list, list_a$i$words)}


This still gives me a list of length 0.

So I do not understand why my attempts didn't give the results I expected, and I do not know why using the subscripts gave me an error. Eventually I figured out the correct syntax:

for(i in list_a){list2 <- append(list2, i$words)}


However, I do not know why this works when the names method did not.

Answer

The for expression in R consists of three parts:

  • LHS, an iterator that will take each value of RHS
  • in, a language keyword
  • RHS, a vector, the length of which defines the number of iterations that will occur.
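
To make the three parts concrete, here is a minimal sketch of what the iterator holds in each style of loop. The list `demo` is a made-up stand-in for list_a, and `last_value` and `last_name` are names introduced only for this illustration.

```r
# Hypothetical stand-in for list_a: a named list of one-column data frames
demo <- list(
  one = data.frame(words = c("a", "b", "c"), stringsAsFactors = FALSE),
  two = data.frame(words = c("a", "s"), stringsAsFactors = FALSE)
)

# RHS is the list itself, so i is each element (a data frame)
for (i in demo) {
  print(class(i))   # "data.frame" on both iterations
}
last_value <- i     # i keeps its final value after the loop ends

# RHS is the character vector of names, so i is each name
for (i in names(demo)) {
  print(class(i))   # "character" on both iterations
}
last_name <- i
```

In both loops the number of iterations is length(RHS), here 2; only the kind of value bound to i differs.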

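When you set up the first loop, RHS was a length 2 vector of type "list", so on each iteration i was a one-column data frame. You then asked $ to extract an element literally named "i" from list_a. No such element exists, so list_a$i evaluated to NULL, list_a$i$words was also NULL, and appending NULL left list at length 0. Your [[ attempt failed for a different reason: [[ does evaluate i, but i was a data frame, and a subscript must be a number or a name, hence the "invalid subscript type" error. In the loop over names(list_a), RHS was a length 2 vector of type "character", but list_a$i still looked for an element named "i", so the same thing occurred.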

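$ does not evaluate its index. Use [[ instead and you will get the answer you expect in the loop over names(list_a). Your final loop works because i is already each data frame, so i$words needs no lookup in list_a at all.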
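
A small sketch of the difference, assuming a made-up list `demo` in place of list_a (`by_dollar`, `by_bracket` and `appended` are names invented for this example):

```r
# Hypothetical stand-in for list_a
demo <- list(
  one = data.frame(words = c("a", "b", "c"), stringsAsFactors = FALSE),
  two = data.frame(words = c("a", "s"), stringsAsFactors = FALSE)
)

i <- "one"

by_dollar  <- demo$i     # looks for an element literally named "i"
by_bracket <- demo[[i]]  # evaluates i first, then looks up "one"

is.null(by_dollar)       # TRUE: there is no element called "i"
by_bracket$words         # "a" "b" "c"

# NULL$words is NULL, and appending NULL adds nothing,
# which is why the result stayed at length 0
appended <- append(list(), by_dollar$words)
length(appended)         # 0
```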

# initialize
list <- list()
# loop
for (i in names(list_a)) {
    list <- append(list, list_a[[i]]$words)
}
list
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
# ...

As mentioned by Roland, appending is very expensive in R, as each iteration creates a new copy of the object. Here is one alternative to try:

# create a data frame using all of list_a, 
# coerce to character vector
# then coerce to list
as.list(unname(unlist(do.call(what = "rbind", args = list_a))))
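
Another option, offered as a sketch rather than the one right way, is to pull the column out of each data frame with lapply and flatten once at the end, so nothing is grown inside a loop. The list `demo` and the names `words_vec` and `words_list` are assumptions made for this example.

```r
# Hypothetical stand-in for list_a
demo <- list(
  one = data.frame(words = c("a", "b", "c"), stringsAsFactors = FALSE),
  two = data.frame(words = c("a", "s"), stringsAsFactors = FALSE)
)

# Extract the words column from every data frame, then flatten in one step
words_vec <- unlist(lapply(demo, function(df) df$words), use.names = FALSE)

# Coerce to a list only if a list is really what you need
words_list <- as.list(words_vec)

words_vec   # "a" "b" "c" "a" "s"
```

If a character vector is enough for the next step, stopping at `words_vec` avoids the final coercion.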

Note that "data.frame" objects are just "list" objects with the "data.frame" class attribute applied. So $ treats its right-hand side as a literal name on data.frames exactly as it does on lists, and you will see the same behaviour. Try this:

# print mtcars data.frame
mtcars
# set class attribute to NULL
class(mtcars) <- NULL
# mtcars is just a list now :-)
mtcars
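
The same point can be checked without printing the whole data set. This is a minimal sketch that works on a copy; `df`, `plain` and the result names are invented for the example.

```r
df <- mtcars                 # copy, so the built-in data set is left alone

df_type <- typeof(df)        # "list": the underlying storage is a list
df_is_list <- is.list(df)    # TRUE

i <- "mpg"
dollar_result <- df$i        # NULL: $ looks for a column literally named "i"
bracket_result <- df[[i]]    # the mpg column, because [[ evaluates i

plain <- unclass(df)         # drop the class attribute
plain_class <- class(plain)  # "list"
```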