Bonono Bonono - 3 months ago 14
R Question

Using a loop (or vectorisation) to subset a list by multiple elements in a vector

I have a list of 3 data.frames:

my_list <- list(a=data.frame(value= c(1:3), class=letters[1:3]),b=data.frame(value=c(4:1),class=letters[1:4]),c=data.frame(value = c(1:5),class = letters[5:1]))

my_list

$a
value class
1 1 a
2 2 b
3 3 c

$b
value class
1 4 a
2 3 b
3 2 c
4 1 d

$c
value class
1 1 e
2 2 d
3 3 c
4 4 b
5 5 a


I want to go into each list element and subset it by the letters a and b from the class column:

wanted_sub_class <- c("a", "b")


and then put the results in a list of my_list per class.

Edit - Expected output:

$a class a
value class
1 a

$a class b
value class
2 b

$b class a
value class
4 a

$b class b
value class
3 b

$c class a
value class
5 a

$c class b
value class
4 b


I've tried to do it with a double loop:

result <- list()

for (i in 1:length(my_list)) {
  for (j in wanted_sub_class) {
    result[[i]] <- subset(my_list[[i]], my_list[[i]]$class == j)
  }
}


This should give me 6 list elements (as per the expected output), but it gives only 3, each containing only class b.

Ideally, if it's actually possible, I want to put the results in a list of my_list per class. That is, I want to keep the structure of the 3 data.frames in the list and then have a list within each of them holding the data for class a and class b. Otherwise, a flat list of six will work.


I understand loops aren't ideal, but I can't really get my head around vectorisation (e.g. using lapply). I would appreciate an answer for both a loop (if it's possible) and vectorisation.

Answer

If we are using purrr from the Hadleyverse (now the tidyverse) family of packages:

library(purrr)
my_list %>% 
      map(~ .[.$class %in% wanted_sub_class,])
#$a
#   value class
#1     1     a
#2     2     b

#$b
#  value class
#1     4     a
#2     3     b

#$c
#  value class
#4     4     b
#5     5     a
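
Since the question mentions lapply, the same per-data.frame filter can be sketched in base R without purrr. This is only an illustration: the name `filtered` is made up here, and the data is rebuilt with stringsAsFactors = FALSE so the behaviour does not depend on the R version.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps 'class'
# as character on both old and new versions of R
my_list <- list(
  a = data.frame(value = 1:3, class = letters[1:3], stringsAsFactors = FALSE),
  b = data.frame(value = 4:1, class = letters[1:4], stringsAsFactors = FALSE),
  c = data.frame(value = 1:5, class = letters[5:1], stringsAsFactors = FALSE)
)
wanted_sub_class <- c("a", "b")

# lapply() calls the function once per data.frame and returns a list
# that keeps the names of my_list
filtered <- lapply(my_list, function(df) {
  df[df$class %in% wanted_sub_class, , drop = FALSE]
})

names(filtered)         # "a" "b" "c"
sapply(filtered, nrow)  # 2 rows kept in each data.frame
```

The anonymous function plays the role of the `~ .[...]` formula in the map() call above.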

Or, if the output needs to have only 'a' and 'b' list elements:

library(dplyr)
my_list %>%
       bind_rows %>%
       filter(class %in% wanted_sub_class) %>% 
       split(., .$class)
#$a
#  value class
#1     1     a
#3     4     a
#6     5     a

#$b
#  value class
#2     2     b
#4     3     b
#5     4     b
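
One thing the combined output loses is which data.frame each row came from. As a hedged sketch, the `.id` argument of bind_rows can carry the list names along as an extra column. The column name "source" and the object name `by_class` are arbitrary choices made for this example.

```r
library(dplyr)

# Same data as in the question; stringsAsFactors = FALSE keeps 'class'
# as character regardless of R version
my_list <- list(
  a = data.frame(value = 1:3, class = letters[1:3], stringsAsFactors = FALSE),
  b = data.frame(value = 4:1, class = letters[1:4], stringsAsFactors = FALSE),
  c = data.frame(value = 1:5, class = letters[5:1], stringsAsFactors = FALSE)
)
wanted_sub_class <- c("a", "b")

# .id = "source" adds a column holding the name of the originating
# list element ("a", "b" or "c")
combined <- my_list %>%
  bind_rows(.id = "source") %>%
  filter(class %in% wanted_sub_class)

# One data.frame per class, each row still tagged with its source
by_class <- split(combined, combined$class)

by_class$a$source  # "a" "b" "c"
by_class$a$value   # 1 4 5
```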

Update

Based on the OP's update, each matching row becomes its own list element:

my_list %>%
       map(~ .[.$class %in% wanted_sub_class,]) %>%
       map(~split(.x, seq_len(nrow(.x)))) %>%
       do.call("c", .)
#$a.1
#  value class
#1     1     a

#$a.2
#  value class
#2     2     b

#$b.1
#  value class
#1     4     a

#$b.2
#  value class
#2     3     b

#$c.1
#  value class
#4     4     b

#$c.2
#  value class
#5     5     a

Or using the bind_rows approach

my_list %>%
    bind_rows %>%
    filter(class %in% wanted_sub_class) %>% 
    split(., seq_len(nrow(.)))
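
The question also asks, ideally, to keep the three data.frames and have one entry per class inside each. A minimal base R sketch of that nested shape, assuming splitting on the class column is what is wanted (the name `nested` is made up for this example):

```r
# Same data as in the question; stringsAsFactors = FALSE keeps 'class'
# as character regardless of R version
my_list <- list(
  a = data.frame(value = 1:3, class = letters[1:3], stringsAsFactors = FALSE),
  b = data.frame(value = 4:1, class = letters[1:4], stringsAsFactors = FALSE),
  c = data.frame(value = 1:5, class = letters[5:1], stringsAsFactors = FALSE)
)
wanted_sub_class <- c("a", "b")

nested <- lapply(my_list, function(df) {
  # Keep only the wanted classes, then split the remaining rows by class
  keep <- df[df$class %in% wanted_sub_class, , drop = FALSE]
  split(keep, keep$class)
})

names(nested)    # "a" "b" "c"  (the original data.frames)
names(nested$c)  # "a" "b"      (one entry per class)
nested$c$a       # the class "a" row of data.frame c
```

Splitting on the class column rather than on the row number means the inner elements are named after the class, so a lookup like nested$b$a reads as "data.frame b, class a".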

Update2

If we need a for loop

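# Pre-allocate a named list with one slot per data.frame
result <- setNames(vector('list', length(my_list)), names(my_list))
for (i in seq_along(my_list)) {
  # Keep only the wanted classes, then split into one element per row
  result[[i]] <- subset(my_list[[i]], class %in% wanted_sub_class)
  result[[i]] <- split(result[[i]], seq_len(nrow(result[[i]])))
}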
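
For reference, the double loop in the question returns only three elements because result[[i]] is assigned once for every j, so the second class overwrites the first. A sketch of one way to keep the double loop and get the flat list of six is to give every (data.frame, class) pair its own name. The "a class a" naming pattern is taken from the expected output in the question.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps 'class'
# as character regardless of R version
my_list <- list(
  a = data.frame(value = 1:3, class = letters[1:3], stringsAsFactors = FALSE),
  b = data.frame(value = 4:1, class = letters[1:4], stringsAsFactors = FALSE),
  c = data.frame(value = 1:5, class = letters[5:1], stringsAsFactors = FALSE)
)
wanted_sub_class <- c("a", "b")

result <- list()
for (i in names(my_list)) {
  for (j in wanted_sub_class) {
    df <- my_list[[i]]
    # A unique key per (data.frame, class) pair, so nothing is overwritten
    key <- paste(i, "class", j)
    result[[key]] <- df[df$class == j, , drop = FALSE]
  }
}

length(result)  # 6
names(result)   # "a class a" "a class b" "b class a" ...
```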