wibeasley - 3 months ago
R Question

bind_rows of different data types

I'd like to stack a list of data.frames, but sometimes the columns have different data types. I'd like the operation to coerce to the lowest common denominator (which is usually character in my case).

This stacking occurs inside a package function that accepts almost any list of data.frames. It doesn't realistically have the ability to coerce ds_a$x to a character before bind_rows().

ds_a <- data.frame(
  x = 1:6,
  stringsAsFactors = FALSE
)
ds_b <- data.frame(
  x = c("z1", "z2"),
  stringsAsFactors = FALSE
)

# These four implementations throw:
# Error: Can not automatically convert from integer to character in column "x".
ds_1 <- dplyr::bind_rows(ds_a, ds_b)
ds_2 <- dplyr::bind_rows(ds_b, ds_a)
ds_3 <- dplyr::bind_rows(list(ds_a, ds_b))
ds_4 <- dplyr::union_all(ds_a, ds_b)


I'd like the output to be a data.frame with a single character vector:

   x
1  1
2  2
3  3
4  4
5  5
6  6
7 z1
8 z2


I have some long-term plans to use meta-data from the (REDCap) database to influence the coercion, but I'm hoping there's a short-term general solution for the stacking operation.

Answer

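We can use rbindlist() from data.table, which coerces each column to the highest type present across the inputs (character here). Note that it returns a data.table; wrap the call in as.data.frame() if a plain data.frame is required.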

library(data.table)
rbindlist(list(ds_a, ds_b))
#    x
#1:  1
#2:  2
#3:  3
#4:  4
#5:  5
#6:  6
#7: z1
#8: z2
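If staying with dplyr is preferable, one possible workaround is to find the columns whose classes disagree across the list and coerce only those to character before calling bind_rows(). The following is a minimal sketch, not a definitive implementation: the helper name stack_coerce() is hypothetical, and it assumes that character is an acceptable fallback for every conflicting column.

```r
# Example inputs from the question
ds_a <- data.frame(x = 1:6, stringsAsFactors = FALSE)
ds_b <- data.frame(x = c("z1", "z2"), stringsAsFactors = FALSE)

# Hypothetical helper (not part of dplyr): stack a list of data.frames,
# coercing any column whose class differs between frames to character.
stack_coerce <- function(frames) {
  all_names <- unique(unlist(lapply(frames, names)))

  # A column is "conflicted" if the frames that contain it disagree on its class
  conflicted <- Filter(function(nm) {
    classes <- unlist(lapply(frames, function(d) {
      if (nm %in% names(d)) class(d[[nm]])[1] else NULL
    }))
    length(unique(classes)) > 1L
  }, all_names)

  # Coerce only the conflicted columns; all other columns keep their type
  frames <- lapply(frames, function(d) {
    for (nm in intersect(conflicted, names(d))) {
      d[[nm]] <- as.character(d[[nm]])
    }
    d
  })

  dplyr::bind_rows(frames)
}

stacked <- stack_coerce(list(ds_a, ds_b))
is.character(stacked$x)  # TRUE
```

Because the check compares classes literally, this sketch also converts an integer/double mismatch to character, even though bind_rows() could combine those on its own; a stricter rule would be needed to keep such columns numeric.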