giacomoV - 2 months ago
R Question

R merge back List after computation

I am having trouble with a simple problem and I can't find a simple solution to it. (This question is probably a duplicate, but I can't find it!)

What I need is to merge a list back into its original list after a computation.

I need to merge because the computation I am doing is too complicated to apply directly to the list. So I have to do it separately and then somehow put the result back into the original dataset. (I can't use mutate directly here because of this problem.)

Because I can't reproduce my data, I will use mtcars to demonstrate my problem.

I have an original list and I am applying a computation to it (it doesn't matter which one), so for example :

library(dplyr)
library(purrr)


My original dataset is a list

dt = mtcars %>%
group_by(gear) %>%
split(.$gear)


Then, on this list, I do a computation, for example :

dt %>%
map(~summarise(., cluster = mean(disp)))


And I end up with a list.

The (real) structure of my data ends up looking like this:

$`3`
gear cluster
1 3 326.3

$`4`
gear cluster
1 4 123


and so on. What I need is simply to merge this list back into the original list.
How can I do this?

What I need (output wanted) is to end up with my original list plus the merged computed values (it's difficult to reproduce here).

Something like

$`3`

mpg cyl disp hp drat wt qsec vs am gear carb cluster
1 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 XXX
2 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 XXX
3 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 XXX
4 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 XXX


and so on for every data frame in the list.

I emphasise again that my original dataset is a list, not a data.frame. What I need is to merge lists, not data.frames.

I thought of something like

dt = mtcars %>% # my data is a list
group_by(gear) %>%
split(.$gear)

fmerge = function(x) x %>% lapply(dt, ., by = 'gear')

dt %>%
map(~summarise(., cluster = mean(disp))) %>%
lapply(fmerge)


or

dt %>%
map(~summarise(., cluster = mean(disp))) %>%
join_all(dt, ., by = 'gear')


But it doesn't work well.

Any clue ?

Answer

We can use bind_rows to rbind the list elements and then do a right_join or left_join

mtcars %>% 
   group_by(gear) %>% 
   split(.$gear) %>% 
   map(~summarise(., cluster = mean(disp))) %>%
   bind_rows() %>%
   right_join(., mtcars, by = "gear")

However, this can be done without the split/map/bind_rows/right_join by just creating the 'cluster' with mutate after we group_by 'gear'

mtcars %>% 
     group_by(gear) %>%
     mutate(cluster = mean(disp))

However, we assume that this simplified process may not work in the OP's original dataset.
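
To check that the two routes above agree on mtcars, here is a minimal sketch. The object names (summaries, joined, mutated) are illustrative and not from the original post; it only assumes dplyr is installed.

```r
library(dplyr)

# Route 1: summarise per group, then join the result back onto every row
summaries <- mtcars %>%
  group_by(gear) %>%
  summarise(cluster = mean(disp))

joined <- left_join(mtcars, summaries, by = "gear")

# Route 2: compute the group mean in place with a grouped mutate
mutated <- mtcars %>%
  group_by(gear) %>%
  mutate(cluster = mean(disp)) %>%
  ungroup()

# Both routes keep the original row order, so the new columns can be compared directly
all.equal(joined$cluster, mutated$cluster)
```

If the real computation cannot be expressed inside mutate, the summarise-then-join route still applies, as long as the summary keeps the grouping key.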

Update

Based on the OP's comments, we can use map2 to do the left_join for the corresponding elements of the two lists

dt %>%
    map(~summarise(., cluster = mean(disp))) %>% 
    map2(dt, ., left_join, by = "gear")

Or if we need a single data.frame, then use map2_df

dt %>%
    map(~summarise(., cluster = mean(disp))) %>% 
    map2_df(dt, ., left_join, by = "gear")
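
For reference, a self-contained sketch of the list-to-list merge, under a few assumptions: the list is built with base split() on ungrouped data, so the key column gear is added to each summary by hand, and the names summaries, merged and combined are illustrative only. bind_rows is used at the end as one possible way to get a single data frame.

```r
library(dplyr)
library(purrr)

# Split mtcars into a named list of data frames, one per value of gear
dt <- split(mtcars, mtcars$gear)

# Per-element computation; each result is one row that carries the key column 'gear'
summaries <- map(dt, ~ summarise(.x, gear = first(gear), cluster = mean(disp)))

# Pair each original element with its own summary and join on 'gear'
merged <- map2(dt, summaries, ~ left_join(.x, .y, by = "gear"))

names(merged)            # the list names from dt are kept
head(merged[["3"]], 3)   # original columns plus 'cluster'

# Optional: stack the merged elements into a single data frame
combined <- bind_rows(merged)
```

map2 walks the two lists in parallel by position, so this relies on dt and summaries having the same order, which holds here because summaries is derived directly from dt.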