Kevin Kevin - 3 years ago 176
R Question

Combine every ith element of a list of lists together using dplyr, purrr

I have a list of identically structured lists as follows:

test1 <- list(first = data.frame(col1 = c(1,2), col2 = c(3,4)),
              second = data.frame(COL1 = c(100,200), COL2 = c(300, 400)))

test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)),
              second = data.frame(COL1 = c(500,600), COL2 = c(700,800)))

orig.list <- list(test1, test2)


I want to:


  1. Bind the rows of the first element of each nested list together, bind the rows of the second element of each nested list together, etc.

  2. Recombine the resulting elements into a single list with an identical structure to the first list.



I can easily do this element by element via:

firsts <- orig.list %>% purrr::map(1) %>% dplyr::bind_rows()
seconds <- orig.list %>% purrr::map(2) %>% dplyr::bind_rows()

new.list <- list(first = firsts, second = seconds)


However, for n list elements this requires that I:


  1. know the number of elements in each list,

  2. know the names and orders of the elements so I can recreate the new list with the correct names and order,

  3. copy and paste the same line of code over and over again.



I'm looking for a way to apply purrr::map (or some other tidyverse function) more generically to combine all elements of a list of lists, preserving the element names and order.

Answer Source

In the simplest case, as in your example data, you can use pmap to walk through the sublists in parallel and bind_rows to combine the corresponding data frames:

library(tidyverse)
pmap(orig.list, bind_rows)

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
#3  500  700
#4  600  800

identical(pmap(orig.list, bind_rows), new.list)
# [1] TRUE
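
To see what the pmap call is doing, here is a rough position-by-position sketch using only lapply and bind_rows. It is an illustration under the assumption that every sublist has the same length and order, not a replacement for pmap; the name by_position is made up for this example.

```r
library(dplyr)

test1 <- list(first  = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
test2 <- list(first  = data.frame(col1 = c(5, 6), col2 = c(7, 8)),
              second = data.frame(COL1 = c(500, 600), COL2 = c(700, 800)))
orig.list <- list(test1, test2)

# For each position i, pull the i-th element out of every sublist
# and stack those data frames on top of each other.
by_position <- lapply(seq_along(orig.list[[1]]), function(i) {
  bind_rows(lapply(orig.list, `[[`, i))
})

# Matching is purely positional, so the names are taken from the first sublist.
names(by_position) <- names(orig.list[[1]])
```

Because the matching is by position, this only gives sensible results when all sublists list their elements in the same order.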

To make this more generic, i.e. to handle cases where the number of elements and the order of names in each sublist can vary, you can use:

map(map_df(orig.list, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

That is, you nest each sublist into a data frame and let bind_rows match the names for you.

Test Cases:

With test1 the same, switch the order of the elements in test2:

test2 <- list(second = data.frame(COL1 = c(500,600), COL2 = c(700,800)),
              first = data.frame(col1 = c(5,6), col2 = c(7,8)))

orig.list1 <- list(test1, test2)

map(map_df(orig.list1, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

gives:

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
#3  500  700
#4  600  800

Now drop one element from test2:

test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)))
orig.list2 <- list(test1, test2)

map(map_df(orig.list2, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

gives:

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
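
The same name-matching idea can also be sketched without nesting, by collecting every element name and binding whatever each sublist holds under that name. This is an alternative to the one-liner above, not the same method; bind_by_name is a hypothetical helper written for this example, and the data repeats the last test case (second missing from test2).

```r
library(dplyr)
library(purrr)

test1 <- list(first  = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
test2 <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)))
orig.list2 <- list(test1, test2)

# bind_by_name() is a hypothetical helper, not part of any package.
bind_by_name <- function(lists) {
  # Every element name seen in any sublist, in order of first appearance.
  all_names <- unique(unlist(map(lists, names)))
  map(set_names(all_names), function(nm) {
    # map(lists, nm) extracts by name and yields NULL where the name is absent;
    # compact() drops those NULLs before the rows are bound.
    bind_rows(compact(map(lists, nm)))
  })
}

combined <- bind_by_name(orig.list2)
```

Since elements are looked up by name, the order of elements inside each sublist does not matter, and a sublist that lacks a name simply contributes no rows to that element.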