Misha Misha - 2 months ago 13
R Question

Purrr-fused about arranging date column

I'm trying to arrange a list-column using purrr, but just creating a toy example is leaving me utterly confused:

s <- tibble(b = as.integer(runif(
n = 10, min = 0, max = 20
)))
s$e <-
map(s$b, ~ sample(seq(
as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
), size = .))


I thought I could do something like this:

s2 <- s %>% map('b') %>%
mutate(e = map(~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .)))


However, this does not work. What am I missing here?

Now, I'd like to arrange the dates in the list-column in ascending order and extract the first and last date. How would I do this in a purrr manner?
I've tried different variations on

s %>% map('e') %>% map_df(~arrange(.))


but clearly I'm missing something here...

My desired output is a new list-column in the data frame s, where the unarranged dates in the list-column s$e are arranged in ascending order in a new list-column s$new_arranged_dates.

> s
# A tibble: 10 × 3
       b           e new_arranged_dates
   <int>      <list>             <list>
1     15 <date [15]>        <date [15]>
2      0  <date [0]>         <date [0]>
3      7  <date [7]>                etc
4      6  <date [6]>
5      3  <date [3]>
6     14 <date [14]>
7     15 <date [15]>
8     13 <date [13]>
9     13 <date [13]>
10    11 <date [11]>


EDIT 290817:

s2 <- s %>%
mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .))) %>% mutate(new_arranged_dates =map(e,~.[order(.)]))


Gets me what I want. However, I do not understand why

s2 <- s %>%
mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .))) %>% mutate(new_arranged_dates=map(e,~arrange(.)))


results in

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) :
argument ".data" is missing, with no default

Answer

So, the basic error here is that arrange() expects a data frame and will not order a bare vector. Coercing each list element to a data_frame inside the map() call solved the problem, but it took me a while to figure out that the single column of the resulting data_frame is also named . (a dot).

So this works:

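  library(dplyr)
  library(purrr)  # provides map()

  s <- tibble(b = as.integer(runif(
       n = 10, min = 0, max = 20
       )))
  s <-
  s %>% mutate(e = map(b,  ~ sample(seq(
    as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
  ), size = .)))

  # data_frame(.) builds a one-column data frame whose column is named `.`,
  # so arrange() can sort by that column
  s2 <- s %>% mutate(arranged = map(e,  ~ arrange(data_frame(.), .)))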
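Note that the approach above stores a one-column data frame in each row. If plain Date vectors are wanted instead, as in the question's EDIT, a minimal sketch using base sort() might look like the following. The fixed seed, the small three-row b column, the date_pool helper and the first_date / last_date column names are my own assumptions for illustration, not part of the original question.

```r
library(dplyr)
library(purrr)

set.seed(1)  # assumption: fixed seed so the toy data is reproducible

# Assumed helper: the pool of days to sample from, same range as the question
date_pool <- seq(as.Date("1990-01-01"), as.Date("2010-01-01"), by = "day")

s <- tibble(b = c(3L, 0L, 5L)) %>%   # small stand-in for the random b column
  mutate(
    e = map(b, ~ sample(date_pool, size = .x)),
    # sort() works directly on a Date vector, no data frame needed
    new_arranged_dates = map(e, sort),
    # x[1] on an empty Date vector gives NA, so empty rows become NA
    first_date = do.call(c, map(new_arranged_dates, ~ .x[1])),
    last_date  = do.call(c, map(new_arranged_dates, ~ rev(.x)[1]))
  )
```

Here do.call(c, ...) is used to combine the length-one dates into a single Date column, since a plain unlist() would drop the Date class.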

Hint: creating a new function with a browser() statement that is called from map helped a lot and will probably be helpful for other people as well.
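To illustrate the hint, here is a rough sketch of what such a debugging helper could look like. The name inspect_element and the sample dates are made up for this example, and the interactive() guard is my addition so the script does not stop when run non-interactively.

```r
library(purrr)

# Hypothetical helper: pauses on every element that map() passes in
inspect_element <- function(x) {
  # In an interactive session this opens the debugger, where you can run
  # class(x) or str(x) to see what map() actually hands to the function
  if (interactive()) browser()
  sort(x)
}

dates <- list(
  as.Date(c("2001-05-01", "1995-03-12")),
  as.Date(character(0))
)

sorted <- map(dates, inspect_element)
```

Stepping through this shows that each element is a bare Date vector rather than a data frame, which is why arrange() complains.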
