zeehio - 4 months ago 28

R Question

I have a tibble as follows:

```
uuu <- structure(list(IsCharacter = c("a", "b"),
                      ShouldBeCharacter = list("One", "Another"),
                      IsList = list("Element1", c("Element2", "Element3"))
                      ),
                 .Names = c("IsCharacter", "ShouldBeCharacter", "IsList"),
                 row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))
uuu
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>            <list>    <list>
#1           a         <chr [1]> <chr [1]>
#2           b         <chr [1]> <chr [2]>
```

I would like to convert columns like "ShouldBeCharacter", where all the elements are of the same length and type into a column similar to "IsCharacter", leaving the rest of the columns untouched.

So far I have the following function that solves the problem, but it looks quite hacky to me. I would like to know if there is a better solution I am not considering:

```
lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 & sapply(data, class) == "list"]
  # Do the conversion
  data[,to_simplify] <- tibble::as_tibble(lapply(as.list(data[,to_simplify]), function(x) {do.call(c, x)}))
  return(data)
}
```

Here is the result I obtain; note how the type of ShouldBeCharacter has changed:

```
lists_to_atomic(uuu)
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>             <chr>    <list>
#1           a               One <chr [1]>
#2           b           Another <chr [2]>
```

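The `as_tibble(lapply(as.list(... do.call(c,...)))` part is what looks hacky to me. Is there any simplification that makes my `lists_to_atomic` function cleaner?

I did not consider using `tidyr::unnest` at first. Here is a version of the function that uses it: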

```
lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 & sapply(data, class) == "list"]
  # Do the conversion
  data2 <- tidyr::unnest_(data, unnest_cols = to_simplify)
  data2 <- data2[, colnames(data)] # Preserve original column order
  return(data2)
}
```

Answer

You can try:

```
library(tidyr)
uuu %>% unnest(ShouldBeCharacter)
```
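If you do not want to name the columns by hand, the same call can probably be driven by a computed selection. The sketch below assumes tidyr 1.0 or later (where `unnest()` takes a `cols` argument) and uses `tidyselect::all_of()`; the helper `is_simplifiable` is a hypothetical name introduced here for illustration, not part of tidyr.

```r
library(tidyr)

uuu <- tibble::tibble(
  IsCharacter = c("a", "b"),
  ShouldBeCharacter = list("One", "Another"),
  IsList = list("Element1", c("Element2", "Element3"))
)

# Hypothetical helper: TRUE for list columns whose elements all have length 1
is_simplifiable <- function(col) is.list(col) && all(lengths(col) == 1L)

# Names of the columns that can be flattened without changing the row count
to_simplify <- names(uuu)[vapply(uuu, is_simplifiable, logical(1))]

# Unnest only those columns (assumes tidyr >= 1.0 for the `cols` argument)
result <- unnest(uuu, cols = tidyselect::all_of(to_simplify))

# Restore the original column order, in case unnest() moved anything
result <- result[names(uuu)]
```

Because only columns with length-one elements are selected, the number of rows should stay the same; `IsList` is left as a list column.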

More examples of how to deal with list columns can be found in "R for Data Science": http://r4ds.had.co.nz/many-models.html#list-columns-1
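For comparison, the check-and-convert idea from the question can also be sketched without tidyr. This is only one possible approach, assuming that "same length and type" means every element has length 1 and the same `typeof()`; the function name `simplify_list_columns` is made up for this example.

```r
library(tibble)

uuu <- tibble(
  IsCharacter = c("a", "b"),
  ShouldBeCharacter = list("One", "Another"),
  IsList = list("Element1", c("Element2", "Element3"))
)

# Hypothetical helper: flatten list columns whose elements are all
# length 1 and share a single type; leave every other column untouched.
simplify_list_columns <- function(data) {
  is_simplifiable <- function(col) {
    is.list(col) &&
      all(lengths(col) == 1L) &&
      length(unique(vapply(col, typeof, character(1)))) == 1L
  }
  simplifiable <- vapply(data, is_simplifiable, logical(1))
  for (nm in names(data)[simplifiable]) {
    # unlist() turns a list of length-one vectors into one atomic vector
    data[[nm]] <- unlist(data[[nm]], use.names = FALSE)
  }
  data
}

simplified <- simplify_list_columns(uuu)
```

Here `lengths()` replaces the nested `sapply(x, length)` calls, and assigning column by column keeps the original column order without an extra reordering step.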